0

My data has two columns - first is TxnAmount, second is TxnDate ( which has date & time in string format)

 **TxnAmount**      **TxnDate**

 100                '2018-04-01 19:12:26'

 135                '2018-04-01 17:03:50'

 210                '2018-04-01 14:37:17'

 70                 '2018-04-02 12:24:49'

 250                '2018-04-02 09:56:18'

Here is my code, which is able to sum ( day wise)

import pandas as pd
import numpy as np
df = pd.read_csv('File.csv')
df["TxnDate"] = pd.to_datetime(df["TxnDate"])
df.set_index(df["TxnDate"],inplace=True)
df['TxnAmount'].resample('D', how='sum')

But, I need the sum to be from 6:00 AM to 6:00 AM interval. The code I assume by default sums according to stand 0:00 AM to 0:00 interval.

Don Corleone
  • 341
  • 2
  • 3
  • 14

1 Answers1

1

Use the base argument in resample:

df.set_index("TxnDate").resample(rule='24H', closed='left', label='left', base=6).sum()

                     TxnAmount
TxnDate                       
2018-04-01 06:00:00        445
2018-04-02 06:00:00        320  

Data:

df = {'TxnAmount': {0: 100, 1: 135, 2: 210, 3: 70, 4: 250},
      'TxnDate': {0: Timestamp('2018-04-01 19:12:26'),
                  1: Timestamp('2018-04-01 17:03:50'),
                  2: Timestamp('2018-04-01 14:37:17'),
                  3: Timestamp('2018-04-02 12:24:49'),
                  4: Timestamp('2018-04-02 09:56:18')}}
andrew_reece
  • 20,390
  • 3
  • 33
  • 58