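I'm not sure whether this question is a good fit for Cross Validated, but I don't know where else to post it.
I have built a simple model using the mgcv package.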
a <- gam(x ~ s(y), method="REML", data=dat100k)
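However, when I run gam.check(a) repeatedly on the same model, I get inconsistent output from one call to the next. In the first call, k is judged to be adequate (although the edf is close to k' and the p-value is not very large), while in the second call k is judged to be too low based on the p-value. Is this normal?
Here is the output from running gam.check(a) twice: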
> gam.check(a)
Method: REML Optimizer: outer newton
full convergence after 8 iterations.
Gradient range [-0.04365533,0.04277738]
(score 407913 & scale 214.1436).
Hessian positive definite, eigenvalue range [4.001815,49713.04].
Model rank = 10 / 10
Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.
       k'  edf k-index p-value
s(y) 9.00 8.98       1    0.66
> gam.check(a)
Method: REML Optimizer: outer newton
full convergence after 8 iterations.
Gradient range [-0.04365533,0.04277738]
(score 407913 & scale 214.1436).
Hessian positive definite, eigenvalue range [4.001815,49713.04].
Model rank = 10 / 10
Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.
       k'  edf k-index p-value
s(y) 9.00 8.98    0.98    0.04 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
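
To narrow this down, here is a minimal sketch for testing whether the difference comes from random simulation rather than from the model itself. According to the mgcv documentation for k.check (which gam.check calls to produce this table), the p-value is obtained by randomly re-shuffling the residuals, and for data sets larger than the subsample argument a random sub-sample is used. The simulated data below is an assumption standing in for dat100k, which is not shown here, and the seed values are arbitrary.

```r
library(mgcv)

# Simulated stand-in for dat100k (assumption: the real data is not shown)
set.seed(1)
n <- 10000
dat <- data.frame(y = runif(n))
dat$x <- sin(2 * pi * dat$y) + rnorm(n)

a <- gam(x ~ s(y), method = "REML", data = dat)

# k.check() returns the basis dimension table that gam.check() prints.
# Fixing the seed before each call fixes the random sub-sample and the
# random re-shuffling of residuals used for the p-value.
set.seed(42)
first <- k.check(a)
set.seed(42)
second <- k.check(a)

# Compare the two tables; without the set.seed() calls they may differ
print(first)
print(second)
print(identical(first, second))
```

If the two tables match when the seed is fixed and differ otherwise, the run-to-run variation would be simulation noise in the p-value rather than a change in the fitted model.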