How to determine the multiple periodicities present in time series data?

Posted on 2025-01-17 00:12:00

My objective is to detect all kinds of seasonalities and their time periods that are present in a timeseries waveform.

I'm currently using the following dataset:
https://www.kaggle.com/rakannimer/air-passengers

At the moment, I've tried the following approaches:

1) Use of FFT:

import pandas as pd
import numpy as np
import scipy.fft
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# https://www.kaggle.com/rakannimer/air-passengers
df = pd.read_csv('AirPassengers.csv')

df.head()

# Note: passing n=100 makes rfft use only the first 100 samples of the series
frequency_eval_max = 100
A_signal_rfft = scipy.fft.rfft(df['#Passengers'], n=frequency_eval_max)
n = np.shape(A_signal_rfft)[0] # np.size(t)
frequencies_rel = len(A_signal_rfft)/frequency_eval_max * np.linspace(0,1,int(n))

fig=plt.figure(3, figsize=(15,6))
plt.clf()
plt.plot(frequencies_rel, np.abs(A_signal_rfft), lw=1.0, c='paleturquoise')
plt.stem(frequencies_rel, np.abs(A_signal_rfft))
plt.xlabel("frequency")
plt.ylabel("amplitude")

This results in the following plot:
[figure: FFT amplitude against relative frequency]

But it doesn't result in anything conclusive or comprehensible.

Ideally I wish to see the peaks representing daily, weekly, monthly and yearly seasonality.

Could anyone point out what I am doing wrong?

2) Autocorrelation:

from pandas.plotting import autocorrelation_plot
plt.rcParams.update({'figure.figsize':(10,6), 'figure.dpi':120})
autocorrelation_plot(df['#Passengers'].tolist())

This produces a plot like the following:
[figure: autocorrelation plot]

But how do I read this plot and how can I derive the presence of the various seasonalities and their periods from this?

3) Seasonal decomposition (seasonal_decompose)

df.set_index('Month',inplace=True)
df.index=pd.to_datetime(df.index)
#drop null values
df.dropna(inplace=True)
df.plot()

result = seasonal_decompose(df['#Passengers'], model='multiplicative', period=12)

result.seasonal.plot()

This gives the following plot:
[figure: seasonal component from seasonal_decompose]

But here I can only see one kind of seasonality.

So how do we detect all the types of seasonalities and their time periods that are present using this method?


Hence, I've tried 3 different approaches but they seem either erroneous or incomplete.

Could anyone please help me out with the most effective approach (even apart from the ones I've tried) to detect all kinds of seasonalities and their time periods for any given timeseries data?


Comments (2)

牵强ㄟ 2025-01-24 00:12:00

I still think a Fourier analysis is the way to go; it's just that the 0-frequency result is overshadowing any insight.

That 0-frequency term is essentially the sum of your samples (the mean times the number of points, or its square if you plot power). Because all records are positive it is large, far from the zero-mean sinusoid you would typically analyze with a Fourier transform. So simply subtract the mean from your dataset before doing the FFT and see how it looks. This would also help with the autocorrelation technique.
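
As a rough illustration of the autocorrelation side of this advice, here is a minimal sketch. It assumes a synthetic monthly series (linear trend plus a 12-month cycle plus noise) standing in for the passenger counts, and it goes one step beyond plain mean subtraction by also removing a linear trend, since a trend would otherwise dominate every lag. The helper `autocorr` is a hypothetical name, not a library function.

```python
import numpy as np

# Synthetic stand-in for the monthly passenger counts (an assumption):
# linear trend + 12-month cycle + small noise, 144 samples.
rng = np.random.default_rng(0)
t = np.arange(144)
series = 100 + 2.0 * t + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 3, t.size)

def autocorr(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag, with the mean removed first."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: x.size - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Remove a fitted straight line (this also removes the mean).
detrended = series - np.polyval(np.polyfit(t, series, 1), t)
acf = autocorr(detrended, max_lag=36)

# Local maxima of the ACF (ignoring lag 0) are candidate periods, in samples.
peaks = [k for k in range(1, 36) if acf[k] > acf[k - 1] and acf[k] > acf[k + 1]]
best_lag = max(peaks, key=lambda k: acf[k])
print("candidate period (months):", best_lag)
```

Reading the plot works the same way: a seasonality of period P shows up as positive peaks at lags P, 2P, 3P, and so on, with troughs halfway between them.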

Also, you MUST give units to your frequency values. Do not settle for the raw values from the FFT; they are determined by the sampling frequency and the span of your dataset. Reason about it and label the daily, weekly, monthly and annual frequencies in your chart accordingly.
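
A minimal sketch of what a frequency axis with units could look like, using the same kind of synthetic monthly series as an assumption (a trend, a 12-month cycle and a weaker 6-month cycle). `np.fft.rfftfreq` with `d=1.0` gives frequencies in cycles per month, so `1 / frequency` is the period in months. Note that with one sample per month the shortest period the FFT can resolve is 2 months, so daily or weekly seasonality cannot appear in monthly data at all.

```python
import numpy as np

# Synthetic monthly data (an assumption, not the Kaggle file).
rng = np.random.default_rng(0)
t = np.arange(144)
series = (100 + 2.0 * t
          + 30 * np.sin(2 * np.pi * t / 12)
          + 10 * np.sin(2 * np.pi * t / 6)
          + rng.normal(0, 3, t.size))

# Remove mean and linear trend so the low-frequency bins do not dominate.
detrended = series - np.polyval(np.polyfit(t, series, 1), t)

amplitude = np.abs(np.fft.rfft(detrended))
freqs = np.fft.rfftfreq(detrended.size, d=1.0)  # cycles per month (d = 1 month)

# Strongest bins, skipping bin 0 (the zero-frequency term).
order = np.argsort(amplitude[1:])[::-1] + 1
top = order[:2]
periods = 1.0 / freqs[top]  # months per cycle

for f, p in zip(freqs[top], periods):
    print(f"{f:.4f} cycles/month -> period of about {p:.1f} months")
```

For plotting, `freqs * 12` would express the same axis in cycles per year, which puts the annual peak at 1.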

独孤求败 2025-01-24 00:12:00

Using an FFT, you can get the fundamental frequency. You can then apply a low-pass filter or just manually select the first n frequencies; these correspond to the 'seasonalities'. Transform the filtered FFT back into the time domain and you can visualize the most basic underlying repetitions. You can then easily calculate the period of each repetition and visualize it by plotting F0, F1, ... individually in the time domain.
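
One way this could be sketched, under assumptions: the series is synthetic, it is detrended first, and the code keeps the n strongest bins rather than literally the first n, which is a variation on the answer's wording. Each kept bin is inverse-transformed on its own so that every component can be plotted separately in the time domain.

```python
import numpy as np

# Synthetic monthly data (an assumption, not the Kaggle file).
rng = np.random.default_rng(0)
t = np.arange(144)
series = (100 + 2.0 * t
          + 30 * np.sin(2 * np.pi * t / 12)
          + 10 * np.sin(2 * np.pi * t / 6)
          + rng.normal(0, 3, t.size))
detrended = series - np.polyval(np.polyfit(t, series, 1), t)

coeffs = np.fft.rfft(detrended)
freqs = np.fft.rfftfreq(detrended.size, d=1.0)  # cycles per month

n_keep = 2  # how many components to keep (a free choice)
top = np.argsort(np.abs(coeffs[1:]))[::-1][:n_keep] + 1  # skip bin 0

# Inverse-transform each kept bin separately: one time-domain wave per period.
components = {}
for k in top:
    single = np.zeros_like(coeffs)
    single[k] = coeffs[k]
    components[1.0 / freqs[k]] = np.fft.irfft(single, n=detrended.size)

reconstruction = sum(components.values())
residual_rms = np.sqrt(np.mean((detrended - reconstruction) ** 2))
print("periods kept (months):", sorted(components))
print("RMS of what is left over:", residual_rms)
```

Each entry of `components` can be passed to `plt.plot` on its own; a small leftover RMS suggests the kept periods explain most of the detrended signal.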
