
Decode Pandas Dataframe

I have an encoded dataframe. I encoded it with the LabelEncoder from scikit-learn, created a machine learning model, and made some predictions. But now I cannot decode the values in t

Solution 1:

See my answer to the following question for the proper usage of LabelEncoder with multiple columns:

Why does sklearn preprocessing LabelEncoder inverse_transform apply from only one column?

The explanation is that LabelEncoder only supports one-dimensional input. So each column needs its own LabelEncoder object, which can then be used to inverse-transform that particular column only.
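To illustrate why a single shared encoder does not work, here is a minimal sketch. The dataframe and its column names (`color`, `size`) are made up for the example and are not from the question. Refitting one LabelEncoder on each column overwrites its `classes_`, so only the last column's mapping survives:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data, only for illustration
df = pd.DataFrame({"color": ["red", "blue", "red"],
                   "size": ["S", "M", "L"]})

shared = LabelEncoder()
codes = {}
for col in df.columns:
    # Each fit_transform call refits the encoder and discards the previous classes
    codes[col] = shared.fit_transform(df[col])

# The encoder now only remembers the classes of the last column ("size")
print(list(shared.classes_))

# Decoding the "color" codes therefore returns "size" labels, not colors
wrong = shared.inverse_transform(codes["color"])
print(list(wrong))
```

The decoded values come from the wrong column, which is the symptom described in the linked question.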

You can use a dictionary of LabelEncoder objects to convert multiple columns. Something like this:

from sklearn import preprocessing

# One fitted encoder per column, keyed by column name
labelencoder_dict = {}
for col in b.columns:
    labelEncoder = preprocessing.LabelEncoder()
    b[col] = labelEncoder.fit_transform(b[col])
    labelencoder_dict[col] = labelEncoder

While decoding, you can just use:

for col in b.columns:
    b[col] = labelencoder_dict[col].inverse_transform(b[col])
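As a quick sanity check, the encode and decode loops can be run end to end on a small dataframe. This is only a sketch: the sample data and the `Device` column are assumptions for illustration, and `b` stands in for your own dataframe.

```python
import pandas as pd
from sklearn import preprocessing

# Hypothetical sample data standing in for the real dataframe `b`
b = pd.DataFrame({"Device": ["phone", "tablet", "phone"],
                  "Activity_Profile": ["low", "high", "low"]})
original = b.copy()

# Encode: one LabelEncoder per column, stored by column name
labelencoder_dict = {}
for col in b.columns:
    encoder = preprocessing.LabelEncoder()
    b[col] = encoder.fit_transform(b[col])
    labelencoder_dict[col] = encoder

# Decode: each column goes back through its own encoder
for col in b.columns:
    b[col] = labelencoder_dict[col].inverse_transform(b[col])

# The round trip should reproduce the original values
print(b.values.tolist() == original.values.tolist())
```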

Update:

Now that you have added the column you are using as y, here's how you can decode it (assuming you have added the 'Predicted_values' column to the dataframe):

for col in b.columns:
    # Skip the predicted column here
    if col != 'Predicted_values':
        b[col] = labelencoder_dict[col].inverse_transform(b[col])

# Use the original `y (Activity_Profile)` encoder on predicted data
b['Predicted_values'] = labelencoder_dict['Activity_Profile'].inverse_transform(
    b['Predicted_values'])
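For context, here is a self-contained sketch of the whole flow: encode, fit, predict, then decode the predictions with the target's encoder. The sample data, the `Device` and `Plan` feature columns, and the choice of `DecisionTreeClassifier` are all assumptions made for the example. Only `Activity_Profile` and `Predicted_values` come from the question.

```python
import pandas as pd
from sklearn import preprocessing
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data; only the target name 'Activity_Profile' is from the question
b = pd.DataFrame({"Device": ["phone", "tablet", "phone", "laptop"],
                  "Plan": ["basic", "basic", "premium", "premium"],
                  "Activity_Profile": ["low", "medium", "high", "high"]})

# Encode every column with its own LabelEncoder
labelencoder_dict = {}
for col in b.columns:
    encoder = preprocessing.LabelEncoder()
    b[col] = encoder.fit_transform(b[col])
    labelencoder_dict[col] = encoder

# Fit a model on the encoded features (the model choice is an assumption)
X = b[["Device", "Plan"]]
y = b["Activity_Profile"]
model = DecisionTreeClassifier(random_state=0).fit(X, y)

# Predictions are encoded labels of the target column
b["Predicted_values"] = model.predict(X)

# Decode the original columns with their own encoders
for col in b.columns:
    if col != "Predicted_values":
        b[col] = labelencoder_dict[col].inverse_transform(b[col])

# Decode the predictions with the target's encoder
b["Predicted_values"] = labelencoder_dict["Activity_Profile"].inverse_transform(
    b["Predicted_values"])
print(b)
```

The predicted column has no encoder of its own because it was never fitted; it shares the label space of `Activity_Profile`, so that encoder is the right one to invert it.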
