Interpreting negative binomial mixed model output in R

Cross Validated: r, mixed-model, generalized-linear-model, poisson-regression, poisson-distribution

2022-03-31 21:30:27

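I am modelling the number of jobs in a year, with random intercepts to account for regional differences. How should I interpret the model output when using the model for prediction/inference?

For example, with the log link function, I would do the following to calculate the expected number of jobs for London in Year 2: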

exp(Intercept + Year*2 + London_intercept) = exp(0.23290 + -0.13369*2 + 0.42820729) = 1.482496

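But this value seems too low; I would expect more than 1.5 jobs in a year. Should I interpret it as a 70% increase over the mean? If so, what does that mean? Can anyone clarify what to do?

Summary output: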

> summary(m8)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: Negative Binomial(0.9321)  ( log )
Formula: Jobs ~ 1 + Year + (1 | Region)
   Data: df_jobs

     AIC      BIC   logLik deviance df.resid 
  3554.5   3575.9  -1773.3   3546.5     1564 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.7836 -0.6187 -0.5081  0.3613  7.2595 

Random effects:
 Groups Name        Variance Std.Dev.
 Region (Intercept) 0.1597   0.3996  
Number of obs: 1568, groups:  Region, 14

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.23290    0.13610   1.711    0.087 .  
Year        -0.13369    0.01572  -8.502   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
     (Intr)
Year -0.536

Random intercepts for Region:

$Region
                      (Intercept)
East Midlands         -0.02413869
East of England        0.18654921
Kent, Surrey & Sussex  0.14610941
London                 0.42820729
North East            -0.42892509
North West             0.48852281
Northern Ireland      -0.26330415
Scotland               0.41420383
South West             0.15415043
Thames Valley         -0.80072297
Wales                 -0.05975189
Wessex                -0.53037342
West Midlands          0.15126500
Yorkshire & Humber     0.21339728

1 Answer

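The model can be interpreted as follows:

The estimate for Year is the negative binomial regression coefficient for a one-unit increase in Year. Assuming each unit is one year, each additional year changes the log of the expected number of jobs by -0.13369. Exponentiating gives exp(-0.13369) ≈ 0.875, so overall the expected number of jobs falls by about 12.5% for each additional year.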
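You can add the fixed intercept and a region's random intercept to get a separate estimate for each region, as you did, but because of partial pooling these estimates are shrunk towards the overall mean. Also note what your formula gives: exp(Intercept + Year*2 + London_intercept) = exp(0.23290 + -0.13369*2 + 0.42820729) = 1.482496 is the expected number of jobs for London at Year = 2, on the count scale and conditional on London's random intercept. It is not a percentage change. London's random intercept on its own corresponds to a multiplier of exp(0.42820729) ≈ 1.53 relative to a region whose random intercept is zero.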
If you want un-shrunk estimates for each region, fit Region as a fixed effect instead of a random intercept.
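In models like this it is common for the association with time to be non-linear, so you may want to consider adding a non-linear term (for example a quadratic term) or using splines. If you use the former, it is a good idea to center Year first.
Another possibility is to let the "effect" of Year vary between regions by fitting random slopes: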

    Jobs ~ 1 + Year + (Year | Region)
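
Going back to the random-intercept model in the question, the arithmetic behind the interpretation above can be sketched in base R. This is only an illustration: the numbers are copied from the summary(m8) and random-intercept output in the question, the variable names are made up for this example, and it assumes Year is coded so that Year = 2 means the second year.

```r
# Estimates copied from the output shown in the question (log scale).
b0       <- 0.23290      # fixed intercept
b_year   <- -0.13369     # fixed slope for Year
u_london <- 0.42820729   # London's random intercept

# Multiplicative change in the expected number of jobs per additional year.
rate_ratio <- exp(b_year)

# Expected jobs at Year = 2 for a hypothetical region whose random intercept is 0.
typical_year2 <- exp(b0 + b_year * 2)

# Expected jobs at Year = 2 for London, conditional on its random intercept.
london_year2 <- exp(b0 + b_year * 2 + u_london)

# London relative to the zero-random-intercept region (the same at every Year,
# because this model has no random slope).
london_ratio <- exp(u_london)

print(round(c(rate_ratio    = rate_ratio,
              typical_year2 = typical_year2,
              london_year2  = london_year2,
              london_ratio  = london_ratio), 4))
```

Read this way, 1.48 is an expected count for London in Year 2, 0.875 is the year-on-year multiplier, and London's expected count is roughly 53% higher than that of a region with a random intercept of zero.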
