机器算法验证 - 如何在 R 中使用 lm() 函数存储标准错误？ - 吾爱随笔录

如何在 R 中使用 lm() 函数存储标准错误？

机器算法验证 r 线性模型

2022-03-15 21:46:55

一切都在标题中...我知道如何存储估计值，但我不知道如何存储它们的标准错误...谢谢

> x <- runif(100)
> y <- 5 + 3 * x + rnorm(100, 0, 0.15)
> reg <- lm(y~x)
> 
> summary(reg)

Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.32198 -0.12626  0.02584  0.10873  0.31411 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  5.00931    0.03087  162.25   <2e-16 ***
x            2.98162    0.05359   55.64   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.1499 on 98 degrees of freedom
Multiple R-squared: 0.9693,     Adjusted R-squared: 0.969 
F-statistic:  3096 on 1 and 98 DF,  p-value: < 2.2e-16 

> 
> reg$coef
(Intercept)           x 
   5.009309    2.981617

3个回答

检查 summary(reg) 返回的对象。你会发现

> str(summary(reg)$coef)
...
> X <- summary(reg)$coef
> X[,2]
(Intercept)           x 
 0.03325738  0.05558073

给你你想要的。或者，如果您自己计算它们（如评论中的@caracal 所示）：

sqrt(diag(vcov(reg)))

Doug Bates 曾经在某处提到访问器函数更可取，所以我会这样做

R> example(lm)   ## to create lm.D9 object
[...]
R> coef(summary(lm.D9))
            Estimate Std. Error  t value    Pr(>|t|)
(Intercept)    5.032   0.220218 22.85012 9.54713e-15
groupTrt      -0.371   0.311435 -1.19126 2.49023e-01
R> str(coef(summary(lm.D9)))
 num [1:2, 1:4] 5.032 -0.371 0.22 0.311 22.85 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "(Intercept)" "groupTrt"
  ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
R> coef(summary(lm.D9))[,"Std. Error"]
(Intercept)    groupTrt 
   0.220218    0.311435 
R>

关键是coef()摘要对象的访问器。

按照@Joris Meys 回答如何计算标准。手动出错。

#Std. Error = residual variance / variable variance =
sqrt(diag(vcov(reg)))

#where vcov(reg) = 
summary(reg)$cov.unscaled * summary(reg)$sigma^2

#where summary(reg)$cov.unscaled = 1/(variable variance) (diagonal of precision matrix) =
solve(t(x) %*% x)

#where summary(reg)$sigma = residual variance =
sqrt(sum(reg$residuals^2) / (nrow(x)-ncol(x)))

> x <- matrix(runif(200),nrow=100)
> y <- 5 + 3 * x[,1] + rnorm(100, 0, 0.15)
> reg <- lm(y~x)
> summary(reg)$coefficient[,'Std. Error']
(Intercept)          x1          x2 
 0.03842706  0.05494507  0.05243990 
> sqrt(diag(vcov(reg)))
(Intercept)          x1          x2 
 0.03842706  0.05494507  0.05243990 
> sqrt(diag( summary(reg)$sigma^2*summary(reg)$cov.unscaled ))
(Intercept)          x1          x2 
 0.03842706  0.05494507  0.05243990 
> x_ = cbind(rep(1,nrow(x)),x)
> sqrt(diag( sum(reg$residuals^2)/(nrow(x)-ncol(x)-1) * solve(t(x_) %*% x_) ))
[1] 0.03842706 0.05494507 0.05243990

有关详细的数学，请检查普通最小二乘中参数的标准误差或如何推导线性回归中系数的方差-协方差矩阵。

其它你可能感兴趣的问题

上一篇ANOVA 中的显着截距是什么意思？下一篇识别和删除异常值是否合适，因为它们会导致问题？