
I have a dataset of workout data (heart rate, watts, etc.) sampled at 1-second intervals. The data feed is not perfect, and sometimes there are gaps. I need the dataset at 1-second intervals with no missing rows.

Once I resample the data, it looks something like this:

    activity_id  watts
t
1         12345      5
2         12345    NaN
3         12345     15
6         98765    NaN
7         98765     10
8         98765     12

After the resample I can't get the interpolation to work properly. The problem is that the interpolation runs across the entire dataframe, and I need it to 'reset' for every workout ID within the dataframe. Once it's working properly, the data should look like this:

    activity_id  watts
t
1         12345      5
2         12345     10
3         12345     15
6         98765    NaN
7         98765     10
8         98765     12
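
To make the problem concrete, here is a minimal sketch that rebuilds the sample frame and interpolates the `watts` column as a whole. The frame construction is an assumption based on the printed table (integer `t` index, numeric `activity_id`), and `naive` is a hypothetical name used only for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the table in the question (assumed layout).
df = pd.DataFrame(
    {
        "activity_id": [12345, 12345, 12345, 98765, 98765, 98765],
        "watts": [5, np.nan, 15, np.nan, 10, 12],
    },
    index=pd.Index([1, 2, 3, 6, 7, 8], name="t"),
)

# Interpolating the whole column ignores the activity boundaries:
# the gap at t=6 is filled from the last value of the previous activity.
naive = df["watts"].interpolate(method="linear")
print(naive)
```

With this frame, the row at `t=6` comes out as 12.5 (halfway between 15 and 10) instead of staying `NaN`, which is the cross-workout bleed described above.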

Here's the snippet of code I have tried. It doesn't throw any errors, but it also doesn't do the interpolation...

seconds = 1
df = df.groupby(['activity_id']).resample(str(seconds) + 'S').mean().reset_index(level='activity_id', drop=True)
df = df.reset_index(drop=False)
df = df.groupby('activity_id').apply(lambda group: group.interpolate(method='linear'))

The accepted answer to "Pandas interpolate within a groupby" is marked as correct, but it is not working for me.

Ethanopp
  • `df['watts'] = df.groupby('activity_id')['watts'].apply(lambda group: group.interpolate(method='linear'))`. You're not assigning the interpolated values to the column. Notice the `['watts']` before `apply`. – Ben Pap May 24 '19 at 20:17
  • Thanks, that worked! – Ethanopp May 31 '19 at 17:52
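
A minimal sketch of the per-group fix from the comment, run against the sample frame from the question. Two things here are assumptions rather than part of the original: the frame is rebuilt by hand from the printed table, and `transform` is swapped in for `apply`. The swap is made because `transform` returns a result aligned to the original index, so the assignment back to the column does not depend on how a given pandas version handles group keys.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the table in the question (assumed layout).
df = pd.DataFrame(
    {
        "activity_id": [12345, 12345, 12345, 98765, 98765, 98765],
        "watts": [5, np.nan, 15, np.nan, 10, 12],
    },
    index=pd.Index([1, 2, 3, 6, 7, 8], name="t"),
)

# Interpolate within each activity only, and assign the result back
# to the column so the frame is actually updated.
df["watts"] = df.groupby("activity_id")["watts"].transform(
    lambda s: s.interpolate(method="linear")
)
print(df)
```

The gap at `t=2` is filled with 10, while the leading `NaN` at `t=6` stays `NaN` because there is no earlier value inside activity 98765 to interpolate from.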
