Python: Conditional sum of rows in dataframe

Asked Feb 01 '18 at 12:36

Active Feb 01 '18 at 13:25

Viewed 63 times

Thanks in advance for your assistance.

Here is the logic that I'm trying to implement: Where exchange, ticker and year match, sum div_amt into a new column called annual_div. I'm importing the data from a CSV and doing the following:

# Change to datetime format
ZACKS_DH_df['year'] = pd.to_datetime(ZACKS_DH_df['div_ex_date']).dt.year

# Sum annual dividend
ZACKS_DH_df['annual_div'] = ZACKS_DH_df.groupby('exchange' and 'ticker' and 'year').sum()['div_amt']

I've included a screenshot of the output I am getting. As you can see I am getting NaN in the annual_div column. I've tried lots of variations but with no success.
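
A likely cause of the NaN values, sketched on a small made-up frame (the column names follow the question; the sample values are hypothetical): in Python, `'exchange' and 'ticker' and 'year'` evaluates to just `'year'`, so the data is grouped by year only. The summed result is then indexed by year, and assigning it back to the dataframe aligns on index labels, which do not match the original row labels.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "exchange": ["NYSE", "NYSE", "NYSE", "NASDAQ"],
    "ticker": ["AAA", "AAA", "AAA", "BBB"],
    "div_ex_date": ["2016-03-01", "2016-09-01", "2017-03-01", "2016-06-01"],
    "div_amt": [0.5, 0.5, 0.6, 0.2],
})
df["year"] = pd.to_datetime(df["div_ex_date"]).dt.year

# `and` between non-empty strings returns the last operand,
# so the original call is effectively groupby('year').
key = "exchange" and "ticker" and "year"
print(key)

# The grouped sum is indexed by year (2016, 2017), not by row position.
# (div_amt is selected before summing here to keep the output small.)
yearly = df.groupby(key)["div_amt"].sum()
print(yearly)

# Column assignment aligns on index labels; row labels 0..3 have no
# counterpart among the year labels, so every value becomes NaN.
df["annual_div"] = yearly
print(df["annual_div"])
```

Passing the keys as a list, `['exchange', 'ticker', 'year']`, groups on all three columns, but the result still needs to be mapped back onto the original rows before it can be stored as a column.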

I'd be happy with either:

1. The annual_div being the sum of the div_amt where exchange, ticker and year match, with that annual_div figure replicated for each of the contributing rows.
2. A new dataframe that keeps all columns except div_amt and div_ex_date, so that there is just one row per ticker per year.
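
The second option, one row per exchange, ticker and year, can be sketched with a plain groupby aggregation. This is a minimal example on hypothetical sample data, not the asker's actual CSV; it assumes the three key columns are the only ones that need to be kept.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "exchange": ["NYSE", "NYSE", "NYSE", "NASDAQ"],
    "ticker": ["AAA", "AAA", "AAA", "BBB"],
    "div_ex_date": ["2016-03-01", "2016-09-01", "2017-03-01", "2016-06-01"],
    "div_amt": [0.5, 0.5, 0.6, 0.2],
})
df["year"] = pd.to_datetime(df["div_ex_date"]).dt.year

# Group on all three keys (passed as a list) and sum div_amt per group.
# as_index=False keeps the keys as ordinary columns in the result.
annual = (
    df.groupby(["exchange", "ticker", "year"], as_index=False)
      .agg(annual_div=("div_amt", "sum"))
)
print(annual)
```

Any other column that is constant within a group could be added to the key list, or carried over with an aggregation such as `"first"`.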

edited Feb 01 '18 at 13:25

A_emperio

asked Feb 01 '18 at 12:36

user3709511

I think you need `ZACKS_DH_df['annual_div'] = ZACKS_DH_df.groupby(['exchange', 'ticker' , 'year'])['div_amt'].transform(sum)`, it is dupe :( – jezrael Feb 01 '18 at 12:38
Thank you so much, this works. While similar questions had been asked before, I had not seen an answer with 'transform' in it. That got it going. Thanks again. – user3709511 Feb 01 '18 at 15:52
You are welcome! Nice day! – jezrael Feb 01 '18 at 15:52
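
The `transform` approach from the comments can be tried on a small frame. This is a sketch on hypothetical sample data; it passes the string `"sum"` instead of the builtin `sum`, which is an equivalent spelling that newer pandas versions prefer.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "exchange": ["NYSE", "NYSE", "NYSE", "NASDAQ"],
    "ticker": ["AAA", "AAA", "AAA", "BBB"],
    "div_ex_date": ["2016-03-01", "2016-09-01", "2017-03-01", "2016-06-01"],
    "div_amt": [0.5, 0.5, 0.6, 0.2],
})
df["year"] = pd.to_datetime(df["div_ex_date"]).dt.year

# transform returns a Series with the same index as df, so the group total
# is repeated on every row that contributed to it.
df["annual_div"] = (
    df.groupby(["exchange", "ticker", "year"])["div_amt"].transform("sum")
)
print(df[["exchange", "ticker", "year", "div_amt", "annual_div"]])
```

Because the result keeps the original index, it can be assigned directly as a new column without any alignment problems.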


0 Answers