I am dealing with a dataset where observations occur between opening and closing hours -- but the service closes on the day after it opens. For example, opening occurs at 7am and closing at 1am, the following day.
This feels like a very common problem -- I've searched around for it and am open to the fact I might just not know the correct terms to search for.
For most of my uses it's enough to do something like:
open_close = pd.DatetimeIndex(start='2012-01-01 05:00:00', periods = 15, offset='D')
Then I can just do fun little groupbys on the df:
df.groupby(open_close.asof).agg(func).
But I've run into an instance where I need to grab multiple of these open-close periods. What I really want to be able to do is just have an DatetimeIndex where I get to pick when an day starts. So I could just redefine 'day' to be from 5AM to 5AM. The nice thing about this is I can then use things like df[df.index.dayofweek == 6]
and get back everything from 5AM on Sunday to 5AM on Monda.
It feels like Periods...or something inside of pandas anticipated this request. Would love help figuring it out.
EDIT:
I've also figured this out via creating another column with the right day
df['shift_day'] = df['datetime'].apply(magicFunctionToFigureOutOpenClose)
-- so this isn't blocking my progress. Just feels like something that could be nicely integrated into the package (or datetime...or somewhere...)