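Recall that an offset is just a predictor variable whose coefficient is fixed at 1. So, using the standard setup for a Poisson regression with a log link, we have:
$$\log \mathrm{E}(Y) = \beta' X + \log \mathcal{E}$$
where $\mathcal{E}$ is the offset/exposure variable. This can be rewritten as
$$\log \mathrm{E}(Y) - \log \mathcal{E} = \beta' X$$
$$\log \mathrm{E}(Y/\mathcal{E}) = \beta' X$$
Your underlying random variable is still $Y$, but by dividing by $\mathcal{E}$ we have converted the LHS of the model equation into a rate of events per unit exposure. This division also changes the variance of the response, so we have to weight by $\mathcal{E}$ when fitting the model.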
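A short derivation may make the weighting step concrete. This is only a sketch under the usual GLM conventions, writing $\mathcal{E}$ for the exposure and $\lambda = \mathrm{E}(Y)/\mathcal{E}$ for the rate; the symbols $\lambda$, $w$, $\phi$ and $V$ are notation introduced here for illustration, not taken from the text above.

$$
\begin{aligned}
Y \sim \mathrm{Poisson}(\mathcal{E}\lambda)
  \;&\Longrightarrow\; \mathrm{Var}(Y/\mathcal{E}) = \frac{\mathcal{E}\lambda}{\mathcal{E}^2} = \frac{\lambda}{\mathcal{E}}, \\
\mathrm{Var}(Y/\mathcal{E}) = \frac{\phi\, V(\lambda)}{w},\quad V(\lambda) = \lambda
  \;&\Longrightarrow\; w = \mathcal{E},\ \phi = 1, \\
\sum_i w_i\, \frac{y_i/\mathcal{E}_i - \lambda_i}{V(\lambda_i)}\, \frac{\partial \lambda_i}{\partial \beta}
  \;&=\; \sum_i \mathcal{E}_i \left( \frac{y_i}{\mathcal{E}_i} - \lambda_i \right) x_i
  \;=\; \sum_i \left( y_i - \mathcal{E}_i \lambda_i \right) x_i \;=\; 0 .
\end{aligned}
$$

The last line uses $\partial \lambda_i / \partial \beta = \lambda_i x_i$ for the log link. The final expression is the score equation of the Poisson model with $\log \mathcal{E}$ as an offset, which is why the weighted rate model and the offset model should give the same coefficient estimates.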
An example in R:
library(MASS) # for Insurance dataset
# modelling the claim rate, with exposure as a weight
# use quasipoisson family to stop glm complaining about nonintegral response
glm(Claims/Holders ~ District + Group + Age,
family=quasipoisson, data=Insurance, weights=Holders)
Call: glm(formula = Claims/Holders ~ District + Group + Age, family = quasipoisson,
data = Insurance, weights = Holders)
Coefficients:
(Intercept)    District2    District3    District4      Group.L      Group.Q      Group.C        Age.L        Age.Q        Age.C
  -1.810508     0.025868     0.038524     0.234205     0.429708     0.004632    -0.029294    -0.394432    -0.000355    -0.016737
Degrees of Freedom: 63 Total (i.e. Null); 54 Residual
Null Deviance: 236.3
Residual Deviance: 51.42 AIC: NA
# with log-exposure as offset
glm(Claims ~ District + Group + Age + offset(log(Holders)),
family=poisson, data=Insurance)
Call: glm(formula = Claims ~ District + Group + Age + offset(log(Holders)),
family = poisson, data = Insurance)
Coefficients:
(Intercept)    District2    District3    District4      Group.L      Group.Q      Group.C        Age.L        Age.Q        Age.C
  -1.810508     0.025868     0.038524     0.234205     0.429708     0.004632    -0.029294    -0.394432    -0.000355    -0.016737
Degrees of Freedom: 63 Total (i.e. Null); 54 Residual
Null Deviance: 236.3
Residual Deviance: 51.42 AIC: 388.7
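Rather than comparing the two printouts by eye, the agreement can be checked programmatically. The following is a minimal sketch that refits both models from above; the object names (`fit_rate`, `fit_offset`, `coef_match`, `se_match`) and the comparison tolerance are illustrative choices, not part of the original example.

```r
library(MASS)  # provides the Insurance dataset

# Fit 1: claim rate as the response, exposure (Holders) as a prior weight
fit_rate <- glm(Claims/Holders ~ District + Group + Age,
                family = quasipoisson, data = Insurance, weights = Holders)

# Fit 2: claim count as the response, log-exposure as an offset
fit_offset <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
                  family = poisson, data = Insurance)

# Coefficients should agree up to IRLS convergence error
# (the tolerance is an assumed, fairly loose value)
coef_match <- isTRUE(all.equal(coef(fit_rate), coef(fit_offset),
                               tolerance = 1e-4))
print(coef_match)

# The quasipoisson fit scales the covariance matrix by its estimated
# dispersion, so its standard errors should equal the Poisson ones
# multiplied by sqrt(dispersion)
disp      <- summary(fit_rate)$dispersion
se_rate   <- sqrt(diag(vcov(fit_rate)))
se_offset <- sqrt(diag(vcov(fit_offset)))
se_match  <- isTRUE(all.equal(se_rate, se_offset * sqrt(disp),
                              tolerance = 1e-4))
print(se_match)
```

Note that the point estimates match but the reported uncertainty does not: the quasipoisson fit estimates a dispersion parameter while the poisson fit fixes it at 1, and the quasi-likelihood fit has no AIC, which is why the first printout shows `AIC: NA`.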