Why does almost every call to R rlm {MASS} return different coefficients?

Cross Validated  r  regression  regression-coefficients  robust

2022-03-23 16:47:46

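I have noticed that rlm {MASS} returns different coefficients almost every time, even though I use the same arguments and the same data set.

My call is: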

model <- rlm(price ~ ., data = data[,-1], weights = weights,
             maxit = 1000000, init = "lts", method = "MM",
             psi=psi.huber, acc=0.00001, scale.est="proposal 2", cor = T)

Results:

(Intercept)  livingArea        area    discrete       dummy 
 -570.621795   17.169323    2.275109   46.002527  143.812900 

(Intercept)  livingArea        area    discrete       dummy 
 -581.893552   16.828956    3.912192   48.253955  180.875439 

(Intercept)  livingArea        area    discrete       dummy 
 -303.488284   16.747009    2.928579   26.951809  -14.795652

These are the three most frequent results.

Can someone explain why rlm behaves this way?

Note: I am a beginner in statistics.

1 Answer

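It is because of the algorithm you chose to compute the initial values (init), from which the second stage of the algorithm, the MM step, starts.

Setting init = "lts" uses the coefficients fitted by the FastLTS algorithm as the starting point for the MM iterations. The FastLTS algorithm in turn uses many random starting points and is therefore itself random. Unless you fix the random seed, you will get different solutions (as you should!).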

library(MASS)
# rlm() has no seed argument of its own; reset R's random seed before each call instead
set.seed(1)
model1 <- rlm(stack.loss ~ ., data = stackloss, maxit = 1000000, init = "lts", method = "MM", psi = psi.huber, acc = 0.00001, scale.est = "proposal 2", cor = T)
set.seed(1)
model2 <- rlm(stack.loss ~ ., data = stackloss, maxit = 1000000, init = "lts", method = "MM", psi = psi.huber, acc = 0.00001, scale.est = "proposal 2", cor = T)
model1$coef - model2$coef  # differences are zero when both fits start from the same seed
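To see where the randomness comes from, it can help to look at the starting fit on its own. The following is a minimal sketch, assuming (as in MASS) that the robust starting fit is computed by lqs() and that lqs() draws its random subsamples from R's global random number generator. The value nsamp = 200 is only an illustrative choice.

```r
library(MASS)

# Two fits without resetting the seed: the random subsamples differ,
# so the coefficients may differ (this is not guaranteed on every run).
fit_a <- lqs(stack.loss ~ ., data = stackloss, method = "lts", nsamp = 200)
fit_b <- lqs(stack.loss ~ ., data = stackloss, method = "lts", nsamp = 200)
coef(fit_a) - coef(fit_b)

# Two fits with the seed reset before each call: the same subsamples are
# drawn, so the coefficients agree.
set.seed(1)
fit_c <- lqs(stack.loss ~ ., data = stackloss, method = "lts", nsamp = 200)
set.seed(1)
fit_d <- lqs(stack.loss ~ ., data = stackloss, method = "lts", nsamp = 200)
coef(fit_c) - coef(fit_d)
```

If the starting fit is reproducible, the deterministic MM iterations that follow it should be reproducible as well.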

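Setting method = "MM" without setting init uses the coefficients fitted by the FastS algorithm as the starting point for the MM iterations (source). The FastS algorithm in turn uses many random starting points and is therefore itself random. Unless you fix the random seed, you will get different solutions (as you should!).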

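library(MASS)
# rlm() has no seed argument of its own; reset R's random seed before each call instead
set.seed(1)
model1 <- rlm(stack.loss ~ ., data = stackloss, maxit = 1000000, method = "MM", psi = psi.huber, acc = 0.00001, scale.est = "proposal 2", cor = T)
set.seed(1)
model2 <- rlm(stack.loss ~ ., data = stackloss, maxit = 1000000, method = "MM", psi = psi.huber, acc = 0.00001, scale.est = "proposal 2", cor = T)
model1$coef - model2$coef  # differences are zero when both fits start from the same seed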
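For small data sets there may be an alternative to fixing the seed: avoiding the random subsampling altogether. The sketch below assumes that rlm() forwards its lqs.control list to lqs(), and that nsamp = "exact" makes lqs() enumerate every subsample instead of drawing random ones. The value maxit = 100 is an illustrative choice, not a recommendation.

```r
library(MASS)

# Assumption: with nsamp = "exact" the starting fit enumerates all subsamples,
# so no random numbers are involved and repeated calls should agree.
fit1 <- rlm(stack.loss ~ ., data = stackloss, method = "MM", maxit = 100,
            lqs.control = list(nsamp = "exact"))
fit2 <- rlm(stack.loss ~ ., data = stackloss, method = "MM", maxit = 100,
            lqs.control = list(nsamp = "exact"))
coef(fit1) - coef(fit2)
```

Exhaustive enumeration grows quickly with the number of observations and predictors, so this is only practical for data about the size of stackloss. For larger data, resetting the seed is the more realistic option.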
