对于 OLS,和。我可以“手动”复制这些。
对于 WLS,在对角线中具有异方差误差和权重,和
。我也可以“手动”复制这些。
替换估计来作用于方差估计。例如对于 HC0 (Zeiles 2004 JSS),使用平方残差。我还可以为 OLS 和 WLS 复制这些“手动”(参见下面的代码)。
但是,summary(lm_wls) 产生的 SE 估计值为 0.25081。我无法“手动”复制这个估计。请参阅下面的“### <-- HERE”箭头处的位置。那里的“diag(sig^2_i)”插入了什么?
###########################################################
### DATA GENERATION #######################################
###########################################################
set.seed(1234)
# Generate a covariate
x <- rnorm(100)
# Generate the propensity score
ps <- (1 + exp(-(-.5 + .8*x)))^-1
# Generate the exposure (i.e., treatment) variable
z <- rbinom(n = 100, size = 1, prob = ps)
# Generate the outcome
y <- 1.1*x - .6*z + rnorm(100, sd = .5)
### Estimate the average treatment effect by OLS
###########################################################
### ORDINARY LEAST SQUARES REGRESSION #####################
###########################################################
### OLS via lm()
lm_ols <- lm(formula = y ~ x + z)
summary(lm_ols)
### Verify OLS 'by hand'
X <- cbind(1, x, z)
(betas <- solve(t(X) %*% X) %*% t(X) %*% y)
(resid_var <- sqrt(sum(lm_ols$residuals^2)/(100 - 3)))
(var_betas <- solve(t(X) %*% X) %*%
(t(X) %*% diag(resid_var^2, 100, 100) %*% X)
%*% solve(t(X) %*% X))
sqrt(diag(var_betas)) # SEs -- Compare to summary(lm_ols)
### Sandwich SEs based on squared residuals
### (see p.4 of Zeiles 2004 JSS)
library(sandwich)
(vcovHC(x = lm_ols, type = "HC0", sandwich = TRUE))
sqrt(diag(vcovHC(x = lm_ols, type = "HC0", sandwich = TRUE)))
### Verify Sandwich SEs for the OLS model
var_betas <- solve(t(X) %*% X) %*%
t(X) %*% diag(lm_ols$residuals^2) %*% X %*%
solve(t(X) %*% X)
sqrt(diag(var_betas)) # SEs -- Compare to sandwich SEs
### Estimate the average treatment effect by propensity
### score weighting
###########################################################
### INVERSE PROPENSITY SCORE WEIGHTING ####################
###########################################################
### Estimate propensity scores
glm1 <- glm(z ~ x, family = "binomial")
ps_est <- predict(glm1, type = "response")
### Create inverse ps weights
wts <- (z/ps_est) + (1-z)/(1-ps_est)
### Estimate average treatment effect via WLS
lm_wls <- lm(formula = y ~ z, weights = wts)
summary(lm_wls)
### Verify WLS
X <- cbind(1, z)
W <- diag(wts)
(betas2 <- solve(t(X) %*% W %*% X) %*% t(X) %*% W %*% y)
(resid_var2 <- sqrt(sum(wts*(lm_wls$residuals^2))/(100 - 2)))
(var_betas2 <- solve(t(X) %*% W %*% X) %*%
(t(X) %*% W %*% "diag(sig^2_i)" %*% t(W) %*% X) %*% ### <-- HERE
solve(t(X) %*% W %*% X))
sqrt(diag(var_betas2)) # SEs
### Sandwich SEs with sandwich package
library(sandwich)
vcovHC(x = lm_wls, type = "HC0", sandwich = TRUE)
sqrt(diag(vcovHC(x = lm_wls, type = "HC0", sandwich = TRUE)))
### Verify Sandwich SEs for the WLS model
(var_betas3 <- solve(t(X) %*% W %*% X) %*%
(t(X) %*% W %*% diag(lm_wls$resid^2) %*%
t(W) %*% X) %*% solve(t(X) %*% W %*% X))
sqrt(diag(var_betas3)) # SEs -- Compare to sandwich SEs