GLM 标准错误

机器算法验证 r 广义线性模型 协方差
2022-03-07 09:04:56

我有一个关于如何获得 GLM 模型中系数的标准误差的问题。我有手动计算的费希尔信息矩阵,但它没有缩放。如何缩放 Fisher 信息矩阵,以便从 GLM 函数中获得相同的标准误差?

2个回答

如何缩放 Fisher 信息矩阵,以便从 GLM 函数中获得相同的标准误差?

将未缩放的协方差矩阵与色散参数计时,如summary.glm. 相关代码summary.glm来自

if (is.null(dispersion)) 
    dispersion <- if (object$family$family %in% c("poisson", 
        "binomial")) 
        1
    else if (df.r > 0) {
        est.disp <- TRUE
        if (any(object$weights == 0)) 
            warning("observations with zero weight not used for calculating dispersion")
        sum((object$weights * object$residuals^2)[object$weights > 
            0])/df.r
    }
    else {
        est.disp <- TRUE
        NaN
    }
# [other code...]
if (p > 0) {
    p1 <- 1L:p
    Qr <- qr.lm(object)
    coef.p <- object$coefficients[Qr$pivot[p1]]
    covmat.unscaled <- chol2inv(Qr$qr[p1, p1, drop = FALSE])
    dimnames(covmat.unscaled) <- list(names(coef.p), names(coef.p))
    covmat <- dispersion * covmat.unscaled
    # [more code ...]

chol2inv(Qr$qr[p1, p1, drop = FALSE])计算_(RR)1=(XWX)1您对此发表评论。这里,R是 QR 分解的上三角矩阵QR=WX.


atiretoo 答案仅在色散参数为 1 时成立,如泊松和二项分布。

你很亲近!系数的标准误差是矩阵对角线的平方根,它是 Fisher 信息矩阵的倒数。这是一个例子。


data <- caret::twoClassSim()
model <- glm(Class~TwoFactor1*TwoFactor2, data = data, family="binomial")
# here are the standard errors we want
SE <- broom::tidy(model)$std.error

X <- model.matrix(model)
p <- fitted(model)
W <- diag(p*(1-p))
# this is the covariance matrix (inverse of Fisher information)
V <- solve(t(X)%*%W%*%X)
all.equal(vcov(model), V)
#> [1] "Mean relative difference: 1.066523e-05"
# close enough

# these are the standard errors: take square root of diagonal 
all.equal(SE, sqrt(diag(V)))
#> [1] "names for current but not for target"  
#> [2] "Mean relative difference: 4.359204e-06"