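I am fitting a binomial regression with a categorical factor that has 9 levels (named "treat.group"), with 1-8 observations per group. One treatment group contains only positive cases (i.e. all responses are 1). This causes an estimation problem in R's "standard" glm() function, because of "perfect separation" at that treatment level. So I am using bayesglm() from the arm package instead.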
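To make the separation concrete, here is a minimal sketch of how it can be spotted by cross-tabulating the response against the factor. The data frame `toy` and its levels "a", "b", "c" are made up for illustration and are not my real data.

```r
# Hypothetical toy data (NOT the real data set): group "b" has only 1s.
toy <- data.frame(
  response    = c(0L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 1L),
  treat.group = c("a", "a", "a", "b", "b", "b", "c", "c", "c")
)

# Counts of 0s and 1s per group; an empty cell flags a separated level.
tab <- table(toy$treat.group, toy$response)
print(tab)

# Levels in which every response is identical (all 0 or all 1).
separated <- rownames(tab)[rowSums(tab == 0) > 0]
print(separated)
```

Any level listed in `separated` has no variation in the response, which is what pushes its glm() coefficient estimate towards infinity.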
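My question is about the link function. The default link is "logit", but I have read that "cloglog" (the complementary log-log link) is often used when the probability of an event is very small or very large. Since my model shows perfect separation in one treatment group, the probability of the event there is very large, so I thought I should use "cloglog". With cloglog, the perfectly separated treatment group comes out as significant, whereas with "logit" it does not.
Am I justified in using "cloglog", or is there a way to look at my results and determine which link is best?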
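For reference, this is how I understand the difference between the two links. It is a minimal base R sketch using make.link() from the stats package; the probabilities in `p` are arbitrary illustration values, not estimates from my data.

```r
# Arbitrary probabilities chosen for illustration only.
p <- c(0.01, 0.5, 0.99)

# Link objects from the stats package (loaded by default in R).
logit   <- make.link("logit")    # log(p / (1 - p))
cloglog <- make.link("cloglog")  # log(-log(1 - p))

# Linear-predictor value that each link assigns to each probability.
links <- data.frame(
  p       = p,
  logit   = logit$linkfun(p),
  cloglog = cloglog$linkfun(p)
)
print(links)
```

The logit is symmetric around p = 0.5, whereas cloglog is not: it is close to the logit for small p but needs a much smaller linear predictor to reach probabilities near 1.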
library(arm)  # provides bayesglm()
f1 <- bayesglm(response ~ treat.group, family = binomial(link = "logit"), data = df)
f2 <- bayesglm(response ~ treat.group, family = binomial(link = "cloglog"), data = df)
(Data frame below)
df <- structure(list(response = c(0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L), treat.group = c("pctc", "phth", "phth", "phtl", "pltc", "pcth", "pltl", "phtc", "pctl", "phtc", "pcth", "pctl", "pctc", "phtl", "plth", "pltc", "phtc", "pcth", "phtl", "plth", "pctl", "pltc", "phtl", "pctc", "pcth", "pltc", "phtc", "phtl", "phtc", "pctl", "pctc", "pcth", "phth", "pctc", "phtl", "pcth", "phth", "phtc", "pcth", "phth")), .Names = c("response", "treat.group"), row.names = c(NA, -40L), class = c("tbl_df", "tbl", "data.frame"), na.action = structure(c(1L, 4L, 5L, 7L, 15L, 21L, 23L, 24L, 27L, 29L, 33L, 37L, 39L, 48L, 50L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 62L, 63L, 65L, 66L, 67L, 68L, 70L, 71L, 72L, 73L), .Names = c("1", "4", "5", "7", "15", "21", "23", "24", "27", "29", "33", "37", "39", "48", "50", "53", "54", "55", "56", "57", "58", "59", "60", "62", "63", "65", "66", "67", "68", "70", "71", "72", "73"), class = "omit"))