I am trying to generate a plot from a Pandas dataframe in Python with Matplotlib. Here is a summary of the dataframe.

import pandas as pd
import numpy as np
import datetime
import matplotlib.pyplot as plt

# Summarize data frame.
>>> df.shape
(40, 4)

>>> df.dtypes
ID                         object
relative_time     timedelta64[ns]
value                     float64
relative_value            float64
dtype: object

>>> df.head()
    ID     relative_time  value  relative_value
0  001 -1 days +18:08:04    4.5            -1.0
1  001 -1 days +18:18:03    4.5            -1.0
2  001 -1 days +18:28:03    4.5            -1.0
3  001 -1 days +18:38:04    4.5            -1.0
4  001 -1 days +18:48:03    4.5            -1.0

>>> df.tail()
     ID     relative_time  value  relative_value
35  001 -1 days +23:58:03    5.5             0.0
36  001          00:08:03    5.5             0.0
37  001          00:18:03    5.5             0.0
38  001          00:28:02    5.5             0.0
39  001          00:38:04    5.5             0.0

I am trying to plot relative_time on the x-axis and relative_value on the y-axis. However, the code below produces an unexpected result: I cannot tell what units the x-axis is in.

# Plot the desired plot.
plt.plot(df['relative_time'], df['relative_value'], marker='.')

[Plot: relative_value against relative_time, x-axis in unclear units]

Note that the x-axis in the plot above is not in units of hours (relative to time 0). A plot in hours would look like the following.

plt.plot(df['relative_time'] / np.timedelta64(1, 'h'), df['relative_value'], marker='.')

[Plot: relative_value against relative_time, x-axis in hours]

How can I plot the x-axis so that it displays time in the same format as the relative_time column? For example, if the x-axis were to have tick marks every hour, they would be labeled as, -1 days +18:00:00, -1 days +19:00:00, ..., 00:00:00, and 01:00:00.

Adam

1 Answer

The units of your x-axis are nanoseconds, as the dtype in your output shows:

>>> df.dtypes
ID                         object
relative_time     timedelta64[ns]  <----- [ns] == nanoseconds
value                     float64
relative_value            float64
dtype: object

It looks like matplotlib simply plots the raw nanosecond values, so you need to format those nanoseconds as strings yourself. Unfortunately, the numpy.timedelta64 data type offers little formatting support, and I couldn't find anything in the NumPy documentation that does this.
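To make the unit issue concrete, here is a minimal sketch using NumPy only. The sample values are made up for illustration and are not taken from the question's dataframe. It shows that a timedelta64[ns] array is stored as integer nanoseconds, and that dividing by a one-hour timedelta gives plain hours.

```python
import numpy as np

# One hour expressed in nanoseconds (dividing two timedeltas yields a float).
ns_per_hour = np.timedelta64(1, "h") / np.timedelta64(1, "ns")

# Illustrative offsets of -6 h, 0 h and +1 h, stored with nanosecond resolution.
offsets = np.array([-21600, 0, 3600], dtype="timedelta64[s]").astype("timedelta64[ns]")

# The underlying integers are nanosecond counts.
raw_ns = offsets.astype("int64")

# Dividing by a one-hour timedelta converts to floating-point hours.
as_hours = offsets / np.timedelta64(1, "h")

print(ns_per_hour)
print(raw_ns)
print(as_hours)
```

Numbers on the order of 1e13 on the x-axis are therefore consistent with a few hours expressed in nanoseconds.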

Example Formatted x-axis labels

Source: matplotlib intelligent axis labels for timedelta

import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib

fig = plt.figure()
ax = fig.add_subplot(111)

Create a list of np.timedelta64 values in nanoseconds. This is similar to what you would get from df["relative_time"].values.

# create list of times
x = [np.timedelta64(k, "ns") for k in range(0,300*10**9,10**9)]

# create some random y-axis data
y = np.random.random(len(x))

ax.plot(x, y)

# Function that formats the axis labels
def timeTicks(x, pos):
    seconds = x / 10**9 # convert nanoseconds to seconds
    # build a datetime.timedelta because its string representation is readable
    d = datetime.timedelta(seconds=seconds)
    return str(d)

formatter = matplotlib.ticker.FuncFormatter(timeTicks)
ax.xaxis.set_major_formatter(formatter)
plt.show()
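The same idea can be applied to the dataframe from the question. The following is a sketch under a few assumptions, not a definitive solution: the dataframe here is a hypothetical stand-in with only the two plotted columns, the helper name format_seconds_as_timedelta is made up, and the x values are converted to seconds explicitly rather than relying on how matplotlib handles timedelta data. It uses str(pd.Timedelta(...)) because that gives the "-1 days +18:00:00" style the question asks for, and drops the "0 days " prefix for non-negative values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MultipleLocator
import pandas as pd


def format_seconds_as_timedelta(seconds, pos=None):
    """Format a tick value in seconds like pandas prints a Timedelta."""
    text = str(pd.Timedelta(seconds=float(seconds)))
    prefix = "0 days "
    # pandas prints "0 days 01:00:00"; the question wants "01:00:00".
    return text[len(prefix):] if text.startswith(prefix) else text


# Hypothetical stand-in for the question's dataframe: one sample every
# ten minutes from -6 hours up to +50 minutes.
df = pd.DataFrame({
    "relative_time": pd.to_timedelta(list(range(-21600, 3600, 600)), unit="s"),
    "relative_value": [-1.0] * 36 + [0.0] * 6,
})

# Plot plain seconds so the tick values have a known unit.
x_seconds = df["relative_time"].dt.total_seconds()

fig, ax = plt.subplots()
ax.plot(x_seconds, df["relative_value"], marker=".")
ax.xaxis.set_major_locator(MultipleLocator(3600))  # one tick per hour
ax.xaxis.set_major_formatter(FuncFormatter(format_seconds_as_timedelta))
fig.autofmt_xdate()  # rotate the long labels so they do not overlap
```

Call plt.show() (with an interactive backend) or fig.savefig(...) to see the result. Working in seconds rather than nanoseconds keeps the tick locator interval easy to read (3600 per hour).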
Filip Kilibarda