机器算法验证 - R中的ARMA建模 - 吾爱随笔录

R中的ARMA建模

机器算法验证 r 时间序列

2022-03-29 02:18:56

我有一个时间序列。我想使用 ARMA 对其进行建模，它将用于预测。

在 RI 中使用arima()函数来获取系数。但arima()需要 order(p,d,q) 作为输入。在 R 中，为 p 和 q（d = 0）得出一个好的值，这样我就不会过度拟合，最简单的方法是什么？

2个回答

和值的最简单方法是使用包forecast中的函数。在任何统计数据包中都没有最简单的方法可以得出好的值。主要原因是没有普遍的好定义。 $p$ $q$ auto.arima

由于您提到过拟合，一种可能的方法是为和的不同值拟合 arima 模型，然后根据您的过拟合标准（例如，样本预测性能）选择最佳模型。基本相同，您可以在 AIC、AICC 和 BIC 之间进行选择，以选择最佳模型。 $p$ $q$ auto.arimaauto.arima

一种选择是使用以下组合来拟合一系列 ARMA 模型 $p$ 和 $q$ 并使用最“适合”的模型。在这里，我使用 BIC 评估“拟合”以尝试惩罚过于复杂的拟合。下面显示了内置 Mauna Loa 的示例 $\mathrm{CO}_2$ 浓度数据集

## load the data
data(co2)
## take only data up to end of 1990 - predict for remaining data later
CO2 <- window(co2, end = c(1990, 12))

## Set up the parameter sets over which we want to operate
CO2.pars <- expand.grid(ar = 0:2, diff = 1, ma = 0:2, sar = 0:1,
                        sdiff = 1, sma = 0:1)
## As you are only wanting ARMA, then you would need something like
## pars <- expand.grid(ar = 0:4, diff = 0, ma = 0:4)
## and where you choose the upper and lower limits - here 0 and 4

## A vector to hold the BIC values for each combination of model
CO2.bic <- rep(0, nrow(CO2.pars))

## loop over the combinations, fitting an ARIMA model and recording the BIC
## for that model. Note we use AIC() with extra penalty given by `k`
for (i in seq(along = CO2.bic)) {
    CO2.bic[i] <- AIC(arima(CO2, unlist(CO2.pars[i, 1:3]), 
                            unlist(CO2.pars[i, 4:6])),
                      k = log(length(CO2)))
}

## identify the model with lowest BIC
CO2.pars[which.min(CO2.bic), ]

## Refit the model with lowest BIC
CO2.mod <- arima(CO2, order = c(0, 1, 1), seasonal = c(0, 1, 1))
CO2.mod
## Diagnostics plots
tsdiag(CO2.mod, gof.lag = 36)

## predict for the most recent data
pred <- predict(CO2.mod, n.ahead = 7 * 12)
upr <- pred$pred + (2 * pred$se) ## upper and lower confidence intervals
lwr <- pred$pred - (2 * pred$se) ## approximate 95% pointwise

## plot what we have done
ylim <- range(co2, upr, lwr)
plot(co2, ylab = ylab, main = expression(bold(Mauna ~ Loa ~ CO[2])),
     xlab = "Year", ylim = ylim)
lines(pred$pred, col = "red")
lines(upr, col = "red", lty = 2)
lines(lwr, col = "red", lty = 2)
legend("topleft", legend = c("Observed", "Predicted", "95% CI"),
       col = c("black", "red", "red"), lty = c(1, 1, 2), bty = "n")

其它你可能感兴趣的问题

上一篇如何测试模型的预测能力？下一篇将任意分布转换为均匀分布