Power spectrum of real data with fftpack on log axis

Question

I already read many discussion about this topic (comparison between lomb-scargle and fft , Plotting power spectrum in python, Scipy/Numpy FFT Frequency Analysis, and many others), but still can't manage it, so I need some tips. I have a list of photon events (detections vs time), the data are available here. The columns are time, counts , errors, and counts in different energy bands (you can ignore them). I know the source has a periodicity around 8.9 days = 1.3*10^-6 Hz. I would like to plot the Power spectrum density showing a peak at this frequency (on a log x-axis, possibly). It would also be nice if I can avoid the half part of the plot (symmetric). This is my code till now, not so far but still something:

import numpy as np
from scipy.fftpack import fft, rfft, fftfreq
import pylab as plt

x,y = np.loadtxt('datafile.txt', usecols = (0,1), unpack=True)
y = y - y.mean() # Removes the large value at the 0 frequency that we don't care about

f_range = np.linspace(10**(-7), 10**(-5), 1000)
W = fftfreq(y.size, d=x[1]-x[0])

plt.subplot(2,1,1)
plt.plot(x,y)
plt.xlabel('Time (days)')

f_signal = fft(y)
plt.subplot(2,1,2)
plt.plot(W, abs(f_signal))
plt.xlabel('Frequency (Hz)')

Here the (useless) plot produced: DFT

Two issues: 1) I think `fftfreq` only works with `fft`, not `rfft` (which has a different frequency axis); 2) `rfft` gives a complex result so plot `abs(f_signal)`. — tom10, Feb 04 '14 at 03:28
Thanks tom. You mean I should use `f_signal = fft(y)` instead of `f_signal = rfft(y)` ? — Py-ser, Feb 04 '14 at 03:42
That will work. Either, `fft(y)` or `rfft(y)` should work since your signal is real, but since `fft` has the correspondence to `fftfreq`, `fft` is probably (counter-intuitively) the easiest way to get the frequency axis right. (Also, with either, you have to get the magnitude also, ie, use `abs`, since it's not uncommon just plotting real component that the result looks choppy like you show since power is moving between the R and I components.) — tom10, Feb 04 '14 at 03:50
Thanks again! I will edit and update the original post in a while. — Py-ser, Feb 04 '14 at 03:55
I think I'm looking at the edits. Can you put units on the x-axes? It's not uncommon that the interesting thing is really close to f=0... could it be the highest peak that you're looking for? — tom10, Feb 04 '14 at 04:12
Edited again. What I am trying to reproduce in the spectrum is the frequency peak for the given periodicity, and to show the plot in a more defined way (logarithmic, narrower range, etc.). — Py-ser, Feb 04 '14 at 04:58
Are you sure that the bottom unit is in `Hz`? It's more likely `1/days` since days is the unit of your data. Then the tallest peak could reasonably be at 9 days. That is, I think you basically have what you want, just massage it a bit. So: 0) make sure the units are correct; 1) zoom in on the x-axis to see the interesting part better; 2) maybe use rfft; 3) take the log if you want. — tom10, Feb 04 '14 at 06:37
The power spectrum is actually the square of the DFT, not its modulo. It helps making peaks more dramatic. — Jaime, Feb 04 '14 at 08:01
Jaime, I think you just answered one of my questions. The answer below should be fine, in part. — Py-ser, Feb 04 '14 at 08:06

Py-ser · Accepted Answer · 2014-02-04T08:25:56.143

Here is an improved version of the code above:

import pyfits
import numpy as np
from scipy.fftpack import fft, rfft, fftfreq
import pylab as plt

x,y = np.loadtxt('data.txt', usecols = (0,1), unpack=True)
y = y - y.mean()

W = fftfreq(y.size, d=(x[1]-x[0])*86400)
plt.subplot(2,1,1)
plt.plot(x,y)
plt.xlabel('Time (days)')

f_signal = fft(y)
plt.subplot(2,1,2)
plt.plot(W, abs(f_signal)**2)
plt.xlabel('Frequency (Hz)')

plt.xscale('log')
plt.xlim(10**(-6), 10**(-5))
plt.show()

And here the plot produced (correctly): fft The highest peak is the peak I was trying to reproduce. The second peak is also expected, but with less power (as it is, indeed). If rfft is used instead of fft (and rfftfreq instead of fftfreq) the same plot is reproduced (in that case, the frequencies values, instead of the module, can be used numpy.fft.rfft)

I don't want to block the topic, so I will ask here: And how can I retrieve the frequencies of the peaks? Would be great to plot the frequencies by side the peaks.

Peak-finding maybe via [find_peaks_cwt](http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks_cwt.html). There are a lot of other approaches, however, which may work better depending on the situation. You could also just ask a new question on here on that topic. There are probably also existing questions like that asked previously here. — pv., Feb 04 '14 at 09:22
To put info about the peak in your graph: In practice, unless you're doing this with many sets of data, just zoom in and read the times and values of the peaks (or whatever you're interested in) from the plot, and then use `plt.annotate` or `plt.txt` to put the numbers in the figures. — tom10, Feb 05 '14 at 00:37

Power spectrum of real data with fftpack on log axis

1 Answers1