Is it possible to iterate over each group after using pd.Grouper in a dask groupby? I've tried:

import dask.dataframe as dd
import pandas as pd
pdf = pd.DataFrame({'A':[1, 2, 3, 4, 5], 'B':['1985','1985','1990','1990','1990']})
pdf['B']=pd.to_datetime(pdf['B'], format="%Y")
ddf = dd.from_pandas(pdf, npartitions = 3)
groups = ddf.groupby(pd.Grouper(key='B', freq="Y"))
for group in ddf['B'].unique().compute():
    print(groups.get_group(pd.Timestamp(group))['A'].mean().compute())

But I get an error:

TypeError: object of type 'TimeGrouper' has no len()

This is similar to the question "iterate over GroupBy object in dask", but with pd.Grouper instead of a plain column key.

Arsen Khachaturyan
victoria55
