I have a feature, city, which is categorical data (i.e. strings).
Instead of hard-coding the mapping with replace(), is there a smarter approach?
train['city'].unique()
Output: ['city_149', 'city_83', 'city_16', 'city_64', 'city_100', 'city_21',
'city_114', 'city_103', 'city_97', 'city_160', 'city_65',
'city_90', 'city_75', 'city_136', 'city_159', 'city_67', 'city_28',
'city_10', 'city_73', 'city_76', 'city_104', 'city_27', 'city_30',
'city_61', 'city_99', 'city_41', 'city_142', 'city_9', 'city_116',
'city_128', 'city_74', 'city_69', 'city_1', 'city_176', 'city_40',
'city_123', 'city_152', 'city_165', 'city_89', 'city_36', .......]
What I tried:
train.replace(['city_149', 'city_83', 'city_16', 'city_64', 'city_100', 'city_21',
'city_114', 'city_103', 'city_97', 'city_160', 'city_65',
'city_90', 'city_75', 'city_136', 'city_159', 'city_67', 'city_28',
'city_10', 'city_73', 'city_76', 'city_104', 'city_27', 'city_30',
'city_61', 'city_99', 'city_41', 'city_142', 'city_9', 'city_116',
'city_128', 'city_74', 'city_69', 'city_1', 'city_176', 'city_40',
'city_123', 'city_152', 'city_165', 'city_89', 'city_36', .......], [1,2,3,4,5,6,7,8,9....], inplace=True)
Is there a better way to convert this column to numeric values? There are 123 unique values, so with replace() I would have to hard-code the numbers 1, 2, 3, 4, ..., 123 myself.
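For reference, here is a minimal sketch of the kind of conversion I am after, without a hand-written list. It assumes train is a pandas DataFrame and uses a tiny made-up sample in place of the real data; the column names city_code and city_num are hypothetical. It may not be the best approach for every model.

```python
import pandas as pd

# Hypothetical stand-in for the real `train` DataFrame (only a few rows)
train = pd.DataFrame(
    {"city": ["city_149", "city_83", "city_16", "city_83", "city_149"]}
)

# Option 1: pd.factorize assigns integer codes in order of first appearance.
# Codes are 0-based, so add 1 to get 1, 2, 3, ... as in the replace() attempt.
codes, uniques = pd.factorize(train["city"])
train["city_code"] = codes + 1

# Option 2: reuse the number already embedded in each label ("city_149" -> 149).
# Assumes every value matches the pattern "<text>_<digits>".
train["city_num"] = (
    train["city"].str.extract(r"_(\d+)$", expand=False).astype(int)
)

print(train)
```

Either way the integers are arbitrary labels, so a model may read an ordering into them that does not exist; one-hot encoding (for example pd.get_dummies) is a common alternative when that matters.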