My ARIMA prediction looks logarithmic. Does that make sense?
I have this DataFrame. It is a large dataset containing dates and temperatures:
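import pandas as pd

dades_estacio = 'Dades_estacio_Das.csv'
# pd.datetime is no longer available in current pandas and date_parser is deprecated,
# so the DATA column is parsed with pd.to_datetime using the same YYYY-MM-DD format
df = pd.read_csv(dades_estacio)
df['DATA'] = pd.to_datetime(df['DATA'], format='%Y-%m-%d')
print('\n Parsed Data:')
df = df.drop('Unnamed: 0', axis=1)
df.head()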
Next I check for stationarity with the ADF test. The p-value is 0.001910736198530445, so I assume the series is stationary:
from statsmodels.tsa.stattools import adfuller

def ad_test(dataset):
    dftest = adfuller(dataset, autolag='AIC')
    print("1. ADF : ", dftest[0])
    print("2. P-Value : ", dftest[1])
    print("3. Num Of Lags : ", dftest[2])
    print("4. Num Of Observations Used For ADF Regression:", dftest[3])
    print("5. Critical Values :")
    for key, val in dftest[4].items():
        print("\t", key, ": ", val)

ad_test(df['Temp mitja'])
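The decision taken in this step can be written out explicitly. This is a minimal sketch, assuming the conventional 5% significance level (the post does not state one); `rejects_unit_root` is a hypothetical helper name, and the p-value is the one reported above.

```python
def rejects_unit_root(p_value: float, alpha: float = 0.05) -> bool:
    """Return True when the ADF null hypothesis (a unit root) is rejected.

    alpha=0.05 is an assumed, conventional threshold, not taken from the post.
    """
    return p_value < alpha


# p-value reported for df['Temp mitja'] in the post
p_value = 0.001910736198530445

print(rejects_unit_root(p_value))               # default 5% level
print(rejects_unit_root(p_value, alpha=0.001))  # a much stricter, hypothetical level
```

A rejected unit root is consistent with the d=0 in the selected order (2,0,2), but it says nothing by itself about how closely the forecasts will follow day-to-day variation.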
I run auto_arima, and the best model it finds has order (2,0,2):
# Apply the ARIMA method
from pmdarima import auto_arima
stepwise_fit = auto_arima(df['Temp mitja'], trace=True,
                          suppress_warnings=True)
I train the model:
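# Split the dataset: hold out the last 30 observations as the test set
print(df.shape)
train = df.iloc[:-30]
test = df.iloc[-30:].copy()  # copy so a column can be added later without a SettingWithCopyWarning
print(train.shape, test.shape)
from statsmodels.tsa.arima.model import ARIMA
model = ARIMA(train['Temp mitja'], order=(2, 0, 2))
model = model.fit()
model.summary()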
Now I predict the test set with the model:
import matplotlib.pyplot as plt

start = len(train)
end = len(train) + len(test) - 1
# typ='levels' belonged to the older ARIMA API; predict() on
# statsmodels.tsa.arima.model.ARIMA results already returns levels, so it is dropped here
pred = model.predict(start=start, end=end)
test['ARIMA Predictions'] = pred
# Plot the test data against the predictions
fig, ax = plt.subplots(figsize=(15, 6))
ax.plot(test['DATA'], test['Temp mitja'], color='tab:orange', label='Temp mitja')
ax.plot(test['DATA'], test['ARIMA Predictions'], color='tab:blue', label='ARIMA Predictions')
plt.xlabel('DATA')
plt.ylabel('Temperatura mitja')
ax.legend(loc='upper right')
plt.show()
But the ARIMA prediction I get is a smooth curve that looks like a logarithm. Is that correct? Or should it look like a line with lots of peaks, like the test series?
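For what it is worth, a smooth curve is what a multi-step forecast from a stationary ARMA model (d=0) generally looks like. Beyond the training data the future shocks are unknown and enter with their expected value of zero, so the forecast relaxes toward the series mean instead of reproducing the peaks. Below is a minimal sketch of that mechanism, using a plain AR(1) recursion rather than the fitted (2,0,2) model. The coefficient `phi`, the mean `mu`, the observed values, and both function names are made up for illustration; none of them are estimates from the data in the post.

```python
def multi_step_forecast(last_value, mu, phi, steps):
    """Forecast `steps` periods ahead from a single origin.

    Each forecast feeds on the previous forecast, so the path decays
    geometrically from last_value toward the mean mu.
    """
    forecasts = []
    prev = last_value
    for _ in range(steps):
        prev = mu + phi * (prev - mu)
        forecasts.append(prev)
    return forecasts


def one_step_forecasts(last_value, observed, mu, phi):
    """Forecast one period ahead, then move the origin to the actual value.

    Each forecast feeds on the latest observation, so the path keeps
    following the ups and downs of the observed series.
    """
    forecasts = []
    prev = last_value
    for actual in observed:
        forecasts.append(mu + phi * (prev - mu))
        prev = actual  # the true value becomes known before the next forecast
    return forecasts


# Hypothetical numbers, chosen only to illustrate the two shapes
mu, phi = 10.0, 0.8
last_train_value = 2.0
test_values = [4.0, 9.0, 3.0, 12.0, 6.0, 11.0]

smooth = multi_step_forecast(last_train_value, mu, phi, steps=len(test_values))
jagged = one_step_forecasts(last_train_value, test_values, mu, phi)

print([round(v, 2) for v in smooth])
print([round(v, 2) for v in jagged])
```

The first list rises steadily toward `mu` in ever smaller steps, which resembles the curve in the plot. The second moves up and down because every forecast starts from a fresh observation. In the code above, `predict(start=len(train), end=...)` forecasts all 30 test points from the end of the training data, which corresponds to the first case.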