Python MySQL encoding Error: 'latin-1' codec can't encode character: ordinal not in range(256)

Question

I have been stuck for a while with a UnicodeEncodeError in Python.

Here is what I am doing:

I create a DataFrame from the results of several analyses. In total, it has 30 columns with mixed value types (int, string, datetime, etc.).
I open an SSH connection to a remote instance in Azure where I have installed MySQL, and I create the database connection using SQLAlchemy.
I run the df.to_sql command and get the following error:

UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2013' in position 8: ordinal not in range(256)

I tried adding charset=utf8 to the connection URL, but it didn't seem to work:

engine = create_engine('mysql+pymysql://user:pwd@host:%s/db?charset=utf8' % server.local_bind_port)

I have read here that I can use u.encode('latin-1', 'replace'). But would I need to perform that and go through every String column and encode it? Or is there something else that I can do?
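For reference, here is a minimal sketch of what that 'replace' option does to the character from the error message, using only the standard library. The sample string is made up for illustration; u'\u2013' is the en dash reported in the traceback.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample value containing an en dash (U+2013),
# the character named in the UnicodeEncodeError above.
text = u"2016\u20132017"

# Strict latin-1 encoding fails, because U+2013 is above code point 255.
try:
    text.encode("latin-1")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'replace' avoids the exception but substitutes "?" for the dash.
replaced = text.encode("latin-1", "replace")

# UTF-8 can represent the dash, as the three bytes E2 80 93.
as_utf8 = text.encode("utf-8")

print(strict_failed, repr(replaced), repr(as_utf8))
```

So 'replace' silences the error at the cost of losing the original character, whereas UTF-8 keeps it.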

@pshep123 - In Azure I am using Python 2.7.12 - In my local PC 2.7.13 Anaconda 4.4.0 — Max Payne, Jun 02 '17 at 00:44
Thanks. Unfortunately I'm not going to be able to help you, but I've been running into unicode issues myself and through my recent research, have come to realize that python 3 and python 2 handle text formatting differently and thus it's important for those more knowledgeable than I to know which version. Here is some reading in the meantime: https://docs.python.org/2/howto/unicode.html, might help. — elPastor, Jun 02 '17 at 00:47

score 0 · Answer 1 · answered Jun 20 '17 at 04:43

This is the solution that I came up with.

I created a function that encoded the different characters in my data.

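def custom_encoder(x):
    # Encode unicode strings to UTF-8 bytes, dropping anything that cannot be encoded
    if isinstance(x, type(u'')):
        return x.encode('utf8', 'ignore')
    # Leave every other value (int, datetime, etc.) unchanged
    return x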

Then I looped through all the columns and encoded all the values. After this, MySQL allowed the data to be written.
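The loop itself is not shown, so the following is only a sketch of how it might look. To keep it runnable without extra packages, a plain dict of lists stands in for the DataFrame; the column names and values are hypothetical. With pandas, the per-column step would presumably be something like df[col] = df[col].map(custom_encoder).

```python
# -*- coding: utf-8 -*-
def custom_encoder(x):
    # Encode unicode strings to UTF-8 bytes; pass other values through.
    if isinstance(x, type(u"")):
        return x.encode("utf8", "ignore")
    return x

# Hypothetical stand-in for the DataFrame: column name -> list of values.
columns = {
    "label": [u"2016\u20132017", u"plain"],
    "count": [1, 2],
}

# Apply the encoder to every value in every column.
encoded = {
    name: [custom_encoder(value) for value in values]
    for name, values in columns.items()
}

print(encoded)
```

This approach fits Python 2.7, which is what the question uses. On Python 3 the same function returns bytes objects rather than text, which is usually not what you want to hand to to_sql.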
