Take for example this dataframe:

  a b c 
0 1 2 3 
1 1 2 3 
2 2 1 1 
3 2 0 0 

I have computed the average of column c for each label in column 'a', as follows:

average_c = df.groupby(['a'])['c'].mean()

I want to add a new column 'd' that holds the difference between each value in column c and the average of c for the label in 'a' that the row belongs to.

I.e.:

  a b c d
0 1 2 3 0
1 1 2 3 0
2 2 1 1 0.5
3 2 0 0 -0.5

I can construct arrays and then add the column using iteration, but my intuition tells me there is a way to do this in a more sophisticated fashion.

I duplicate the column with

df['d'] = df['c']

and I assume I need to include some operation here like -average_c['a'] but I'm a bit lost at this point.
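
For what it's worth, one way that intuition could be made to work is sketched below. It assumes the example frame above is rebuilt by hand, and it uses `Series.map` to look up each row's label in `average_c`; this is only one possible approach, not necessarily the best one.

```python
import pandas as pd

# Example frame from the question, rebuilt by hand
df = pd.DataFrame({"a": [1, 1, 2, 2], "b": [2, 2, 1, 0], "c": [3, 3, 1, 0]})

# Per-label averages of c, indexed by the values of a
average_c = df.groupby(["a"])["c"].mean()

# Look up each row's label in average_c, then subtract row by row
df["d"] = df["c"] - df["a"].map(average_c)
print(df)
```

Here `df['a'].map(average_c)` produces a Series aligned with `df`, so the subtraction needs no explicit loop.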

  • Just to be clear for the linked question: `df['d'] = df['c'] - df.groupby('a')['c'].transform('mean')` – wjandrea Mar 04 '23 at 18:35
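
A minimal runnable sketch of the one-liner in the comment above, assuming the example frame from the question is rebuilt by hand. `transform('mean')` returns the group mean broadcast back to the original row index, so no separate `average_c` lookup is needed.

```python
import pandas as pd

# Example frame from the question, rebuilt by hand
df = pd.DataFrame({"a": [1, 1, 2, 2], "b": [2, 2, 1, 0], "c": [3, 3, 1, 0]})

# Group mean of c, repeated for every row of its group
group_mean = df.groupby("a")["c"].transform("mean")

# Difference between each value and its group's mean
df["d"] = df["c"] - group_mean
print(df)
```

The intermediate `group_mean` variable is only there for readability; the subtraction can be written on one line as in the comment.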
