I have a pandas dataframe where one of the columns is ratedby, whose values are male or female. My goal is to create two columns with OneHotEncoder (ratedbymale, ratedbyfemale), each holding 1 or 0 as appropriate.
I am using Azure ML Designer with the Execute Python Script component, which takes a dataframe as a parameter and can output two dataframes.
The code I entered is:
# The script MUST contain a function named azureml_main
# which is the entry point for this module.
# imports up here can be used to
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
# The entry point function MUST have two input arguments.
# If the input port is not connected, the corresponding
# dataframe argument will be None.
# Param<dataframe1>: a pandas.DataFrame
# Param<dataframe2>: a pandas.DataFrame
def azureml_main(dataframe1 = None, dataframe2 = None):
    # Execution logic goes here
    print(f'Input pandas.DataFrame #1: {dataframe1}')
    # If a zip file is connected to the third input port,
    # it is unzipped under "./Script Bundle". This directory is added
    # to sys.path. Therefore, if your zip file contains a Python file
    # mymodule.py you can import it using:
    # import mymodule
    # Return value must be of a sequence of pandas.DataFrame
    # E.g.
    # - Single return value: return dataframe1,
    # - Two return values: return dataframe1, dataframe2
    enc = OneHotEncoder(handle_unknown='ignore')
    onehotencoder_df = pd.DataFrame(enc.fit_transform(dataframe1[['ratedby']]))
    dataframe1.join(onehotencoder_df)
    return dataframe1, onehotencoder_df
However, I am getting this error:
AmlExceptionMessage:User program failed with InvalidDatasetError: Result dataset2 contains invalid data, ('Could not convert (0, 1)\t1.0 with type csr_matrix: did not recognize Python value type when inferring an Arrow data type', 'Conversion failed for column 0 with type object').
ModuleExceptionMessage:InvalidDataset: Result dataset2 contains invalid data, ('Could not convert (0, 1)\t1.0 with type csr_matrix: did not recognize Python value type when inferring an Arrow data type', 'Conversion failed for column 0 with type object').
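The `csr_matrix` in the message suggests that `fit_transform` returns a SciPy sparse matrix by default, so wrapping it directly in `pd.DataFrame` produces an object column that cannot be converted to Arrow. Below is a minimal sketch of one possible fix, not a definitive implementation. It assumes a dense result is acceptable and that the output columns should be named `ratedby` plus the category value; the names `encoded`, `columns`, and `result` are illustrative. It also assigns the result of `join`, since `DataFrame.join` returns a new frame rather than modifying `dataframe1` in place.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder


def azureml_main(dataframe1=None, dataframe2=None):
    enc = OneHotEncoder(handle_unknown='ignore')
    # fit_transform returns a SciPy sparse matrix by default;
    # .toarray() converts it to a dense NumPy array.
    encoded = enc.fit_transform(dataframe1[['ratedby']]).toarray()
    # Build column names (e.g. ratedbyfemale, ratedbymale) from the fitted categories.
    columns = [f'ratedby{category}' for category in enc.categories_[0]]
    onehotencoder_df = pd.DataFrame(
        encoded, columns=columns, index=dataframe1.index
    ).astype(int)
    # DataFrame.join returns a new frame, so the result has to be assigned.
    result = dataframe1.join(onehotencoder_df)
    return result, onehotencoder_df
```

Using `.toarray()` avoids depending on the encoder's dense-output constructor argument, whose name differs between scikit-learn versions.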