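I am trying to understand the "mechanics" behind the calculation of ACF values in a time series. As a "prove it to yourself" exercise [NOTE: I updated the code in this link to reflect the information in the accepted answer], I am not concerned with the elegance of the code, and I am deliberately using loops. The problem is that although the values I get are close to those produced by `acf()`, they are not equal, and at some lags they are not that close. Is there a conceptual misunderstanding?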
As recorded in the online notes linked above, the data were generated following this online example.
set.seed(0) # To reproduce results
x = seq(pi, 10 * pi, 0.1) # Creating time-series as sin function.
y = 0.1 * x + sin(x) + rnorm(x) # Time-series sinusoidal points with noise.
y = ts(y, start = 1800) # Labeling time axis.
model = lm(y ~ I(1801:2083)) # Detrending (0.1 * x)
st.y = y - predict(model) # Final de-trended ts (st.y)
ACF = 0 # Vector to collect the auto-correlations (grown inside the loop).
ACF[1] = cor(st.y, st.y) # The first entry in the ACF is the cor with itself.
for(i in 1:24){ # Took 24 points to parallel the output of `acf()`
lag = st.y[-c(1:i)] # Introducing lags in the stationary ts.
clipped.y = st.y[1:length(lag)] # Compensating by reducing length of ts.
ACF[i + 1] = cor(clipped.y, lag) # Storing each correlation.
}
w = acf(st.y, plot = FALSE) # Extracting values of acf without plotting.
all.equal(ACF, as.vector(w$acf)) # Checking for equality in values.
# Pretty close to manual calc: "Mean relative difference: 0.03611463"
To give a tangible sense of how the two outputs compare, this is what the manually calculated autocorrelations look like:
1.0000000 0.3195564 0.3345448 0.2877745 0.2783382 0.2949996 ...
... -0.1262182 -0.1795683 -0.1941921 -0.1352814 -0.2153883 -0.3423663
as opposed to the values from R's `acf()` function:
1.0000000 0.3187104 0.3329545 0.2857004 0.2745302 0.2907426 ...
... -0.1144625 -0.1621018 -0.1737770 -0.1203673 -0.1924761 -0.3069342
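As has been suggested, the difference is most likely explained by comparing my code with what happens in the following line inside `acf()`: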
acf <- .Call(C_acf, x, lag.max, type == "correlation")
How can this `C_acf` function be accessed?
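Short of reading the C source, the discrepancy can be probed from the R side. The following is a minimal sketch, assuming that `acf()` uses the standard sample autocorrelation: every lag is centered on the single overall mean and scaled by the lag-0 sum of squares, whereas `cor()` in the loop above re-centers and re-scales each pair of clipped segments. The helper name `manual_acf` is hypothetical and not part of R.

```r
# Rebuild the de-trended series from the question.
set.seed(0)
x    <- seq(pi, 10 * pi, 0.1)
y    <- 0.1 * x + sin(x) + rnorm(length(x))
st.y <- as.numeric(residuals(lm(y ~ seq_along(y))))

# Hypothetical helper: sample autocorrelation with ONE overall mean and
# ONE overall sum of squares (an assumption about what acf() computes).
manual_acf <- function(z, lag.max) {
  n  <- length(z)
  d  <- z - mean(z)          # deviations from the overall mean
  c0 <- sum(d^2)             # lag-0 sum of squares (the 1/n factors cancel)
  sapply(0:lag.max, function(k) sum(d[1:(n - k)] * d[(1 + k):n]) / c0)
}

w <- acf(st.y, lag.max = 24, plot = FALSE)
all.equal(manual_acf(st.y, 24), as.vector(w$acf))
```

If the assumption holds, the comparison should no longer report a mean relative difference. Because the numerator at lag k sums only n - k products while the denominator always sums n squares, values from this estimator are pulled toward zero as the lag grows, which would be consistent with the gap widening at the later lags listed above.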
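I have a similar issue with the PACF, which I assume is related. The code for the PACF values is here. [Note: in this case, I suspect it is actually just a rounding difference].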
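For the PACF, one way to check whether the gap is really rounding is to rebuild the values from the sample ACF. This is only a sketch, assuming that `pacf()` obtains the partial autocorrelations from the sample autocorrelations through the Durbin-Levinson recursion instead of fitting successive regressions. The helper name `manual_pacf` is hypothetical.

```r
# Rebuild the de-trended series from the question.
set.seed(0)
x    <- seq(pi, 10 * pi, 0.1)
y    <- 0.1 * x + sin(x) + rnorm(length(x))
st.y <- as.numeric(residuals(lm(y ~ seq_along(y))))

# Hypothetical helper: Durbin-Levinson recursion.
# rho holds the autocorrelations at lags 1..K (lag 0 excluded).
manual_pacf <- function(rho) {
  K   <- length(rho)
  out <- numeric(K)
  phi <- rho[1]                       # coefficients of the order-1 fit
  out[1] <- phi
  if (K >= 2) for (k in 2:K) {
    num    <- rho[k] - sum(phi * rho[(k - 1):1])
    den    <- 1 - sum(phi * rho[1:(k - 1)])
    phi_kk <- num / den               # partial autocorrelation at lag k
    phi    <- c(phi - phi_kk * rev(phi), phi_kk)  # order-k coefficients
    out[k] <- phi_kk
  }
  out
}

rho <- as.vector(acf(st.y, lag.max = 24, plot = FALSE)$acf)[-1]
p   <- pacf(st.y, lag.max = 24, plot = FALSE)
all.equal(manual_pacf(rho), as.vector(p$acf))
```

If a regression-based manual PACF differs from `pacf()` while this recursion agrees with it, the difference would come from the estimator rather than from rounding.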