In R, testing whether a coefficient in lm differs from a given value (other than zero)

Tags: r, regression coefficients, inference

2022-03-23 23:08:52

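In R, is there a way to use the lm function to test the hypothesis that a coefficient differs from some nonzero value? For example, if the model is: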

Y = a + b1x1 + b2x2 + b3x3 + e

It is easy to test whether a single b differs from an arbitrary number. If you want to test b1 = 10, you can estimate:

h0 <- lm(Y ~ offset(10*x1) + x2 + x3)
h1 <- lm(Y ~ x1 + x2 + x3)
anova(h0,h1)
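The comparison above can be tried out on simulated data. The following is only a sketch: the data frame d, the sample size, and the true coefficients are assumptions made for illustration and are not part of the question. With a single restriction, the F statistic reported by anova() should match the square of the t statistic (b1_hat - 10) / se(b1_hat), which the code computes by hand.

```r
# Simulated data; the coefficients 3, 10, 9, 4 are illustrative assumptions
set.seed(42)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$Y <- 3 + 10 * d$x1 + 9 * d$x2 + 4 * d$x3 + rnorm(n)

# Restricted model: the coefficient on x1 is fixed at 10 via offset()
h0 <- lm(Y ~ offset(10 * x1) + x2 + x3, data = d)
# Unrestricted model: the coefficient on x1 is estimated
h1 <- lm(Y ~ x1 + x2 + x3, data = d)

# F test of the single restriction b1 = 10
tab <- anova(h0, h1)
f_stat <- tab$F[2]
p_f <- tab$`Pr(>F)`[2]

# Same test by hand: t statistic for H0: b1 = 10
est <- coef(summary(h1))["x1", ]
t_stat <- unname((est["Estimate"] - 10) / est["Std. Error"])
p_t <- 2 * pt(-abs(t_stat), df = df.residual(h1))

# The two pairs should agree up to numerical precision
c(F = f_stat, t_squared = t_stat^2)
c(p_anova = p_f, p_t = p_t)
```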

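But what if you want to test a = 3, b1 = 10, b2 = 9, and b3 = 4 at the same time and get a p-value for each? In that case anova only tests whether all of these restrictions hold jointly. Is there a way to test these restrictions in one go but still get a p-value for each coefficient?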

2 Answers

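Besides using anova() to carry out an F test, you can also use summary() to obtain the marginal Wald test for each coefficient. So it is easy to include a single offset containing all terms and then call summary(). Each reported estimate is then the difference between the fitted coefficient and its hypothesized value, and its t test is a test of that hypothesized value. Here I use some artificial data with the coefficients suggested in your post: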

set.seed(1)
d <- data.frame(
  x1 = runif(100, -1, 1),
  x2 = runif(100, -1, 1),
  x3 = runif(100, -1, 1)
)
d$y <- 3 + 10 * d$x1 + 9 * d$x2 + 4 * d$x3 + rnorm(100)
m <- lm(y ~ x1 + x2 + x3, data = d,
  offset = 3 + 10 * x1 + 9 * x2 + 4 * x3)
summary(m)
## Call:
## lm(formula = y ~ x1 + x2 + x3, data = d, offset = 3 + 10 * x1 + 
##     9 * x2 + 4 * x3)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3.00507 -0.67884 -0.08825  0.68643  2.55013 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.03520    0.10947   0.322    0.748
## x1          -0.09123    0.20091  -0.454    0.651
## x2          -0.19222    0.19572  -0.982    0.329
## x3           0.01885    0.19285   0.098    0.922
## 
## Residual standard error: 1.058 on 96 degrees of freedom
## Multiple R-squared:  0.9824, Adjusted R-squared:  0.9818 
## F-statistic:  1784 on 3 and 96 DF,  p-value: < 2.2e-16
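The coefficient table above can also be reproduced by hand from an ordinary fit without any offset, which makes explicit what each row tests. This is a sketch only: the names beta0, fit, and by_hand are introduced here for illustration and do not appear in the answer.

```r
# Recreate the artificial data from the answer
set.seed(1)
d <- data.frame(
  x1 = runif(100, -1, 1),
  x2 = runif(100, -1, 1),
  x3 = runif(100, -1, 1)
)
d$y <- 3 + 10 * d$x1 + 9 * d$x2 + 4 * d$x3 + rnorm(100)

# Hypothesized values (beta0 is an illustrative name, not from the answer)
beta0 <- c("(Intercept)" = 3, x1 = 10, x2 = 9, x3 = 4)

# Ordinary fit without any offset
fit <- lm(y ~ x1 + x2 + x3, data = d)
cf <- coef(summary(fit))

# Wald t test of H0: coefficient = hypothesized value, one per coefficient
delta <- cf[, "Estimate"] - beta0[rownames(cf)]
t_val <- delta / cf[, "Std. Error"]
p_val <- 2 * pt(-abs(t_val), df = df.residual(fit))
by_hand <- cbind(delta, cf[, "Std. Error"], t_val, p_val)

# Offset fit from the answer, for comparison
m <- lm(y ~ x1 + x2 + x3, data = d,
  offset = 3 + 10 * x1 + 9 * x2 + 4 * x3)

# Compare the two coefficient tables, ignoring row and column names
all.equal(unname(by_hand), unname(coef(summary(m))))
```

The standard errors are identical in both fits because the offset only shifts the response by a known quantity; only the estimates, and hence the t statistics, change.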

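Another convenient option for carrying out these linear hypothesis tests is the linearHypothesis() function from the car package. It can replicate the marginal Wald tests: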

library("car")
m2 <- lm(y ~ x1 + x2 + x3, data = d)
linearHypothesis(m2, "(Intercept) = 3")
## Linear hypothesis test
## 
## Hypothesis:
## (Intercept) = 3
## 
## Model 1: restricted model
## Model 2: y ~ x1 + x2 + x3
## 
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     97 107.67                           
## 2     96 107.55  1   0.11586 0.1034 0.7485

...

linearHypothesis(m2, "x3 = 4")
## Linear hypothesis test
## 
## Hypothesis:
## x3 = 4
## 
## Model 1: restricted model
## Model 2: y ~ x1 + x2 + x3
## 
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     97 107.56                           
## 2     96 107.55  1    0.0107 0.0096 0.9224

It can also carry out the joint F test for all coefficients:

linearHypothesis(m2,
  c("(Intercept) = 3", "x1 = 10", "x2 = 9", "x3 = 4"))
## Linear hypothesis test
## 
## Hypothesis:
## (Intercept) = 3
## x1 = 10
## x2 = 9
## x3 = 4
## 
## Model 1: restricted model
## Model 2: y ~ x1 + x2 + x3
## 
##   Res.Df    RSS Df Sum of Sq     F Pr(>F)
## 1    100 108.93                          
## 2     96 107.55  4    1.3802 0.308  0.872
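For readers without the car package, the joint F statistic can be sketched in base R from the two residual sums of squares. The names res0, rss0, rss1, and q are assumptions introduced for this illustration. Under the joint hypothesis nothing is estimated, so the restricted residuals are simply y minus the hypothesized linear predictor and the restricted model keeps all 100 degrees of freedom.

```r
# Recreate the artificial data from the answer
set.seed(1)
d <- data.frame(
  x1 = runif(100, -1, 1),
  x2 = runif(100, -1, 1),
  x3 = runif(100, -1, 1)
)
d$y <- 3 + 10 * d$x1 + 9 * d$x2 + 4 * d$x3 + rnorm(100)

# Unrestricted fit
fit <- lm(y ~ x1 + x2 + x3, data = d)
rss1 <- sum(residuals(fit)^2)
df1 <- df.residual(fit)

# Fully restricted model: all four coefficients fixed, nothing estimated
res0 <- d$y - (3 + 10 * d$x1 + 9 * d$x2 + 4 * d$x3)
rss0 <- sum(res0^2)

# F statistic for q = 4 restrictions and its p-value
q <- 4
f_stat <- ((rss0 - rss1) / q) / (rss1 / df1)
p_val <- pf(f_stat, q, df1, lower.tail = FALSE)
c(RSS0 = rss0, RSS1 = rss1, F = f_stat, p = p_val)
```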

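It is not clear what your confusion is... but the model

$Y = X \beta + Z \gamma + \varepsilon$

can equivalently be written as

$\left( Y - X\beta^0 \right) = X \left( \beta - \beta^0 \right) + Z\gamma + \varepsilon$

if we want to test $\beta = \beta^0$ directly. In the rewritten model, the hypothesis $\beta = \beta^0$ is the usual hypothesis that the coefficient on $X$ is zero.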
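The reparameterization can be sketched in R by subtracting the hypothesized contribution of X from the response before fitting. The simulated variables x, z, y and the value beta0 = 2 below are assumptions chosen for illustration only.

```r
# Simulated data (illustrative assumptions, not from the answer)
set.seed(123)
n <- 150
x <- rnorm(n)   # regressor whose coefficient is tested
z <- rnorm(n)   # regressor left unrestricted
y <- 1 + 2 * x - 0.5 * z + rnorm(n)

beta0 <- 2      # hypothesized value of the coefficient on x

# Shift the response by x * beta0 and refit on the same regressors
shifted <- lm(I(y - beta0 * x) ~ x + z)
plain <- lm(y ~ x + z)

# The coefficient on x in the shifted fit estimates beta - beta0
coef(shifted)["x"]
coef(plain)["x"] - beta0

# The coefficient on z and all standard errors are unchanged
coef(summary(shifted))[, "Std. Error"]
coef(summary(plain))[, "Std. Error"]

# The usual t test on x in the shifted fit tests H0: beta = beta0
coef(summary(shifted))["x", c("t value", "Pr(>|t|)")]
```

Passing the same quantity through offset() in lm is an equivalent way to write this; subtracting it from the response just makes the algebra visible.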
