
I have a pandas DataFrame df that contains a column, say x, and I would like to create another column from x that holds, for each row, the number of times that row's value occurs in x (its value count).

Here is my approach:

x_counts = []

for item in df['x']:
    item_count = len(df[df['x']==item])
    x_counts.append(item_count)
    
df['x_count'] = x_counts

This works, but it is very inefficient because it filters the whole DataFrame once for every row. I am looking for a more efficient way to handle this. Your approaches and recommendations are highly appreciated.

JA-pythonista

1 Answer


It sounds like you are looking for the groupby function, since you want the count of each item in x. There are other built-in methods for this, but their behavior may differ between pandas versions. I assume you want to group identical elements and sum over them, which gives one row per distinct value of x together with its count:

df.loc[:, 'x_count'] = 1  # add a helper column x_count holding 1 in every row
aggregate_functions = {"x_count": "sum"}  # summing the ones counts the rows per value of x
# as_index=False keeps x as a regular column instead of making it the index;
# sort=False keeps the groups in order of first appearance
df = df.groupby(["x"], as_index=False, sort=False).aggregate(aggregate_functions)
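If the count should instead sit alongside every original row, as the question describes, here is a minimal sketch of two vectorized alternatives. The sample data and the column name x_count_alt are made up for illustration, and rows where x is missing would most likely end up with NaN rather than a count, so treat this as a starting point rather than a definitive solution.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df
df = pd.DataFrame({"x": ["a", "b", "a", "c", "a", "b"]})

# Option 1: compute each group's count and broadcast it back to every row
df["x_count"] = df.groupby("x")["x"].transform("count")

# Option 2: build the counts once, then look each value up in them
# (x_count_alt is an illustrative name, not from the original post)
df["x_count_alt"] = df["x"].map(df["x"].value_counts())

print(df)
```

Both options keep the original number of rows and avoid the Python-level loop, so the work no longer grows with one full scan per row.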

Hope it helps.

Syed Bilal Ali