I have been trying to solve an exercise for some time and I haven’t been able to do it, I have a dataset containing a list of calls with the topic of the call (in this sample dataset I decided to use ice cream flavors as topics), In the call center they consider that a topic was solved in the first time if the topic was no mentioned in another call using a time window of 72 hours. I need to create a new column in the data frame that counts the number of times the Ice cream flavor in the row was mentioned within a window of 72 hours (count the number of occurrences of an event within a time window).
I saw a solution using get_dummies but this would be inefficient for me since I have more than 300 Ice cream flavors:
pandas `value_counts` on a rolling time window
The following is a sample of the data I have:
2014-01-01 07:21:51 Apple
2014-01-01 10:00:47 Orange
2014-01-01 13:24:58 Banana
2014-01-01 15:05:22 Strawberry
2014-01-01 23:26:55 Lemon
2014-01-02 10:07:15 Orange
2014-01-02 10:57:23 Banana
2014-01-03 06:32:11 Peach
2014-01-03 11:29:02 Orange
2014-01-03 19:07:37 Coconut
2014-01-03 19:39:53 Mango
2014-01-04 00:02:36 Grape
2014-01-04 06:51:53 Cherry
2014-01-04 07:53:01 Strawberry
2014-01-04 08:57:48 Coconut
And this is the expected result:
2014-01-01 07:21:51 Apple 1
2014-01-01 10:00:47 Orange 1
2014-01-01 13:24:58 Banana 1
2014-01-01 15:05:22 Strawberry 1
2014-01-01 23:26:55 Lemon 1
2014-01-02 10:07:15 Orange 2
2014-01-02 10:57:23 Banana 2
2014-01-03 06:32:11 Peach 1
2014-01-03 11:29:02 Orange 3
2014-01-03 19:07:37 Coconut 1
2014-01-03 19:39:53 Mango 1
2014-01-04 00:02:36 Grape 1
2014-01-04 06:51:53 Cherry 1
2014-01-04 07:53:01 Strawberry 2
2014-01-04 08:57:48 Coconut 2
I have found some similar questions, but not quite solving my need:
group by time and other column in pandas
Rolling count pandas for categorical variables using time
In pandas how to calculate 'Countif' on a moving window basis?