I have a dataset containing workout samples (heart rate, watts, etc.) recorded at 1-second intervals. The data feed is not perfect, and sometimes there are gaps. I need the dataset at 1-second intervals with no missing rows.
Once I resample the data, it looks something like this:
   activity_id  watts
t
1        12345      5
2        12345    NaN
3        12345     15
6        98765    NaN
7        98765     10
8        98765     12
After the resample, I can't get the interpolation to work properly. The problem is that the interpolation runs across the entire dataframe, and I need it to 'reset' for every activity_id within the dataframe, so that a leading NaN in one workout is not filled from the previous workout. The data should look like this once it's working properly:
   activity_id  watts
t
1        12345      5
2        12345     10
3        12345     15
6        98765    NaN
7        98765     10
8        98765     12
Here's the snippet of code I have tried. It doesn't throw any errors, but it also doesn't do the interpolation:
seconds = 1
df = df.groupby(['activity_id']).resample(str(seconds) + 'S').mean().reset_index(level='activity_id', drop=True)
df = df.reset_index(drop=False)
df = df.groupby('activity_id').apply(lambda group: group.interpolate(method='linear'))
The answer marked as correct in this question is not working for me: Pandas interpolate within a groupby
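For reference, here is a minimal sketch of per-activity interpolation on the sample data above. It is not the accepted answer from the linked question, only one possible approach. It assumes the frame has already been resampled, that t is a plain integer index in seconds, and that only the watts column needs filling. The sample frame is rebuilt by hand for illustration.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the resampled data shown above (assumption: integer index t).
df = pd.DataFrame(
    {
        "activity_id": [12345, 12345, 12345, 98765, 98765, 98765],
        "watts": [5.0, np.nan, 15.0, np.nan, 10.0, 12.0],
    },
    index=pd.Index([1, 2, 3, 6, 7, 8], name="t"),
)

# Interpolate watts separately within each activity_id.
# transform returns a Series aligned to the original index, so it can be assigned back.
# limit_area="inside" only fills NaNs that sit between two valid values,
# so a NaN at the start of a workout is left as NaN.
df["watts"] = df.groupby("activity_id")["watts"].transform(
    lambda s: s.interpolate(method="linear", limit_area="inside")
)

print(df)
```

Selecting the column and using transform keeps the original index, which sidesteps the extra index level that groupby(...).apply(...) may add depending on the pandas version.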