I have just started my first project in python. I have a pandas dataframe of cryptocurrency data with four columns "id", "date", "close", and "volume".
I want to calculate the On Balance Volume (OBV) for each group (id) in my dataframe. My goal is to have a new column OBV which contains the OBV for each group.The formuala for OBV is as follows:
If today's close > yesterday's close then: OBV = Yesterday’s OBV + Today’s Volume; If today’s close < than yesterday’s close then: OBV = Yesterday’s OBV – Today’s Volume; If today’s close = to yesterday’s close then: OBV = Yesterday’s OBV
On Balance Volume is calculated by adding the day's volume to a cumulative total when the security's price closes up, and subtracting the day's volume when the security's price closes down.
My dataframe df looks like this:
|id |last_updated |close |volume
|-- |------------ |----- |------
|1 |2021-10-08 |0.0113 |134.431757
|1 |2021-10-09 |0.0115 |57.222013
|1 |2021-10-10 |0.0115 |75.848861
|1 |2021-10-11 |0.0120 |77.568503
|1 |2021-10-12 |0.0118 |19.712469
|2 |2021-10-08 |0.1104 | 0.000000
|2 |2021-10-09 |0.1632 | 0.176570
|2 |2021-10-10 |0.1603 | 7.636593
|2 |2021-10-11 |0.1509 |193.415714
|2 |2021-10-12 |0.1046 | 0.000000
Since I have a large dateset of cryptocurrency data, I want to run vectorized computation in order to to express batch operations on data without writing any for loops:
df['OBV']=np.where(df['close'] > df['close'].shift(1), df['volume'], np.where(df['close'] < df['close'].shift(1), -df['volume'], 0)).cumsum()
With this approach, however, I only obtain the correct calculation of OBV for id=1. The calculation of the OBV for id=2 uses the last value of OBV from id=1 instead of restarting excecuting the calculations from the first row of id=2.
I have also tried the OBV calculation with a loop but I also do not know how to iterate with the loop through each group, that is to excecute the loop within each group.
OBV=[]
OBV.append(0)
#Loop through the data set (close price)
for i in range(1, len(df.close)):
if df.close[i] > df.close[i-1]:
OBV.append(OBV[-1] + df.vol[i])
elif df.close[i] < df.close[i-1]:
OBV.append(OBV[-1] - df.vol[i])
else:
OBV.append(OBV[-1])
df['OBV']=OBV
So how do I obtain a new colum OBV in my dataframe df which contains the OBV within each group?
Any help is very much appreciated. Thank you!