如何在 R 中使用 lm() 函数存储标准错误?

机器算法验证 r 线性模型
2022-03-15 21:46:55

一切都在标题中...我知道如何存储估计值,但我不知道如何存储它们的标准错误...谢谢

> x <- runif(100)
> y <- 5 + 3 * x + rnorm(100, 0, 0.15)
> reg <- lm(y~x)
> 
> summary(reg)

Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.32198 -0.12626  0.02584  0.10873  0.31411 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  5.00931    0.03087  162.25   <2e-16 ***
x            2.98162    0.05359   55.64   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.1499 on 98 degrees of freedom
Multiple R-squared: 0.9693,     Adjusted R-squared: 0.969 
F-statistic:  3096 on 1 and 98 DF,  p-value: < 2.2e-16 

> 
> reg$coef
(Intercept)           x 
   5.009309    2.981617 
3个回答

检查 summary(reg) 返回的对象。你会发现

> str(summary(reg)$coef)
...
> X <- summary(reg)$coef
> X[,2]
(Intercept)           x 
 0.03325738  0.05558073 

给你你想要的。或者,如果您自己计算它们(如评论中的@caracal 所示):

sqrt(diag(vcov(reg)))

Doug Bates 曾经在某处提到访问器函数更可取,所以我会这样做

R> example(lm)   ## to create lm.D9 object
[...]
R> coef(summary(lm.D9))
            Estimate Std. Error  t value    Pr(>|t|)
(Intercept)    5.032   0.220218 22.85012 9.54713e-15
groupTrt      -0.371   0.311435 -1.19126 2.49023e-01
R> str(coef(summary(lm.D9)))
 num [1:2, 1:4] 5.032 -0.371 0.22 0.311 22.85 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "(Intercept)" "groupTrt"
  ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
R> coef(summary(lm.D9))[,"Std. Error"]
(Intercept)    groupTrt 
   0.220218    0.311435 
R> 

关键是coef()摘要对象的访问器。

按照@Joris Meys 回答如何计算标准。手动出错。

#Std. Error = residual variance / variable variance =
sqrt(diag(vcov(reg)))

#where vcov(reg) = 
summary(reg)$cov.unscaled * summary(reg)$sigma^2

#where summary(reg)$cov.unscaled = 1/(variable variance) (diagonal of precision matrix) =
solve(t(x) %*% x)

#where summary(reg)$sigma = residual variance =
sqrt(sum(reg$residuals^2) / (nrow(x)-ncol(x)))

> x <- matrix(runif(200),nrow=100)
> y <- 5 + 3 * x[,1] + rnorm(100, 0, 0.15)
> reg <- lm(y~x)
> summary(reg)$coefficient[,'Std. Error']
(Intercept)          x1          x2 
 0.03842706  0.05494507  0.05243990 
> sqrt(diag(vcov(reg)))
(Intercept)          x1          x2 
 0.03842706  0.05494507  0.05243990 
> sqrt(diag( summary(reg)$sigma^2*summary(reg)$cov.unscaled ))
(Intercept)          x1          x2 
 0.03842706  0.05494507  0.05243990 
> x_ = cbind(rep(1,nrow(x)),x)
> sqrt(diag( sum(reg$residuals^2)/(nrow(x)-ncol(x)-1) * solve(t(x_) %*% x_) ))
[1] 0.03842706 0.05494507 0.05243990

有关详细的数学,请检查普通最小二乘中参数的标准误差如何推导线性回归中系数的方差-协方差矩阵