How can I obtain standard errors from a zero-inflated count data regression in R?

Cross Validated r generalized-linear-model count-data zero-inflation
2022-03-31 03:15:50

The following code

PredictNew <- predict (glm.fit, newdata = Predict, X1 =X1, Y1= Y1, 
                       type = "response", se.fit = TRUE)

produces a 3-column data.frame, PredictNew, containing the fitted values, the standard errors and a residual scale term.

Perfect... but with a model fitted using zeroinfl {pscl}, either

PredictNew <- predict (zeroinfl.fit, newdata = Predict, X1 =X1, Y1= Y1, 
                       type = "response", se.fit = TRUE)

or

PredictNew <- predict (zeroinfl.fit, newdata = Predict, X1 =X1, Y1= Y1, 
                       type = "response", se.fit = TRUE, MC = 2500, conf = .95)

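produces only a single-column vector of fitted values. However, I am very keen to have the standard errors. Everything I have read says they should be produced.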

(The code has been simplified somewhat. I actually have four variables and an offset, and predict.glm with se.fit = TRUE produces the SEs without any problem.)

1 Answer

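As far as I know, the predict method for zeroinfl results does not return standard errors. If your goal is to construct confidence intervals, one attractive alternative is bootstrapping. I say attractive because the bootstrap has the potential to be more robust, at the cost of some efficiency if all the assumptions behind the analytic SEs are in fact met.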

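Here is some rough code to do what you want. It will not run exactly as written, because the formula, starting values and data name come from my own example, but hopefully you can make the necessary corrections.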

## load boot package
require(boot)
## output coefficients from your original model
## these can be used as starting values for your bootstrap model
## to help speed up convergence and the bootstrap
dput(round(coef(zeroinfl.fit, "count"), 3))
dput(round(coef(zeroinfl.fit, "zero"), 3))

## function to pass to the boot function to fit your model
## needs to take data, an index (as the second argument!) and your new data
f <- function(data, i, newdata) {
  require(pscl)
  m <- zeroinfl(count ~ child + camper | persons, data = data[i, ],
                start = list(count = c(1.598, -1.0428, 0.834),
                             zero = c(1.297, -0.564)))
  mparams <- as.vector(t(do.call(rbind, coef(summary(m)))[, 1:2]))
  yhat <- predict(m, newdata, type = "response")
  return(c(mparams, yhat))    
}

## set the seed and do the bootstrap, make sure to set your number of cpus
## note this requires a fairly recent version of R
set.seed(10)
res <- boot(dat, f, R = 1200, newdata = Predict, parallel = "snow", ncpus = 4)

## get the bootstrapped percentile CIs
## the 10 here is because in my initial example, there were 10 parameters before predicted values
yhat <- t(sapply(10 + (1:nrow(Predict)), function(i) {
  out <- boot.ci(res, index = i, type = c("perc"))
  with(out, c(Est = t0, pLL = percent[4], pUL = percent[5]))
}))

## merge CIs with predicted values
Predict<- cbind(Predict, yhat)

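I pulled this code from two pages I wrote: one on bootstrapping the parameters of a zero-inflated Poisson regression fitted with zeroinfl, and one showing how to get bootstrapped confidence intervals for predicted values from a zero-truncated negative binomial model. Combined, they will hopefully give you enough examples to get this working for the predicted values of a zero-inflated Poisson. You may also get some ideas for graphing :)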
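If what you need is a standard error rather than an interval, the same idea can be sketched in a self-contained form. This is only a minimal illustration under some assumptions: it uses the bioChemists data shipped with pscl in place of your data, an arbitrary formula (art ~ fem + ment | ment), and the first three rows as hypothetical "new" observations. The names boot_pred and new_obs are made up for the example. The bootstrap SE of each prediction is taken as the standard deviation of its bootstrap replicates.

```r
library(pscl)   # zeroinfl() and the bioChemists example data
library(boot)   # boot()

data("bioChemists", package = "pscl")

# Hypothetical new observations to predict for (illustration only)
new_obs <- bioChemists[1:3, ]

# Statistic for boot(): refit the model on the resampled rows,
# then return the predicted means for the new observations
boot_pred <- function(data, i, newdata) {
  m <- zeroinfl(art ~ fem + ment | ment, data = data[i, ])
  predict(m, newdata = newdata, type = "response")
}

set.seed(10)
# R = 200 keeps the example quick; use more replicates in practice
res <- boot(bioChemists, boot_pred, R = 200, newdata = new_obs)

# res$t has one row per replicate and one column per new observation;
# the column-wise standard deviation is the bootstrap SE
pred_se <- apply(res$t, 2, sd)
pred_out <- data.frame(fit = as.vector(res$t0), se = pred_se)
pred_out
```

The fit column holds the predictions from the model fitted to the full data and the se column their bootstrap standard errors. With your own model you would swap in your formula, data and newdata, and possibly add starting values as in the code above to speed up convergence.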