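I have again compared a manual calculation of ridge regression with the lm.ridge function, but the results from the two approaches do not seem to match. They agree only for one particular value of the penalty `lam`.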
library(MASS)  # provides lm.ridge
set.seed(1)
x <- rnorm(1000, 1, 2)
x <- matrix(x, ncol = 10, nrow = 100)
y <- rnorm(100, 2, 5)
# center and scale the predictors and the response
xs <- scale(x, TRUE, TRUE)
ys <- scale(y, TRUE, TRUE)
p <- dim(x)[2]
lam <- 2
# manual calculation
bh <- solve(t(xs) %*% xs + lam * diag(p), t(xs) %*% ys)
# lm.ridge
fit <- lm.ridge(ys ~ xs - 1, lambda = lam)
coef_fit <- as.matrix(coef(fit))  # one-column matrix of coefficients
cbind(bh, coef_fit)
Does anyone know why the estimated coefficients agree only for that one value of the penalty and not for other values?
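Update: Sorry for not scaling the data earlier. I have now scaled the data, but there is still a discrepancy between the estimated coefficients.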
> cbind(bh, coef_fit)
             [,1]         [,2]
xs1  -0.144767582 -0.144799855
xs2  -0.114627840 -0.114652989
xs3  -0.019612430 -0.019612567
xs4   0.007292303  0.007293982
xs5   0.044335298  0.044354816
xs6  -0.034135483 -0.034137483
xs7   0.020260806  0.020265217
xs8   0.058511001  0.058520197
xs9  -0.124643955 -0.124671909
xs10  0.060076729  0.060097567
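
One possibility that may be worth checking is a difference in scaling convention. The sketch below assumes that lm.ridge internally rescales each column by its root mean square with divisor n, whereas scale() uses the sample standard deviation with divisor n - 1. Under that assumption, the penalty that lm.ridge effectively applies to xs would be lam * (n - 1) / n. The names lam_adj, bh_adj, gap_raw and gap_adj are introduced here for illustration only.

```r
library(MASS)  # provides lm.ridge

set.seed(1)
n <- 100
p <- 10
x  <- matrix(rnorm(n * p, mean = 1, sd = 2), nrow = n, ncol = p)
y  <- rnorm(n, mean = 2, sd = 5)
xs <- scale(x, center = TRUE, scale = TRUE)              # sd uses divisor n - 1
ys <- as.numeric(scale(y, center = TRUE, scale = TRUE))
lam <- 2

# Manual ridge solution with the penalty used as-is
bh <- solve(crossprod(xs) + lam * diag(p), crossprod(xs, ys))

# Assumption: lm.ridge rescales columns with divisor n, so on xs the
# effective penalty would be lam * (n - 1) / n
lam_adj <- lam * (n - 1) / n
bh_adj  <- solve(crossprod(xs) + lam_adj * diag(p), crossprod(xs, ys))

fit <- lm.ridge(ys ~ xs - 1, lambda = lam)
coef_fit <- coef(fit)

# Largest absolute gap to lm.ridge for each manual variant
gap_raw <- max(abs(bh - coef_fit))
gap_adj <- max(abs(bh_adj - coef_fit))
print(c(raw = gap_raw, adjusted = gap_adj))
```

If the adjusted gap sits at machine precision while the raw gap stays around the size seen in the table above, the discrepancy would come down to the scaling convention rather than to an error in either calculation.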