我正在执行参数引导以测试我的模型中是否需要特定的固定效果。我这样做主要是为了锻炼,如果我的程序到目前为止是正确的,我很感兴趣。
首先,我拟合要比较的两个模型。其中一个包括要测试的效果,另一个不包括。当我测试我设置的固定效果时REML=FALSE:
mod8 <- lmer(log(value)
~ matching
+ (sentence_type | subject)
+ (sentence_type | context),
data = wort12_lmm2_melt,
REML = FALSE)
mod_min <- lmer(log(value)
~ 1
+ (sentence_type | subject)
+ (sentence_type | context),
data = wort12_lmm2_melt,
REML = FALSE)
两种模型都适合包含少量缺失值的平衡数据集。70 个受试者的观测值略高于 11000 个。每个受试者只看到每个项目一次。因变量是阅读时间;sentence_type 和matching 是两级因素。上下文和主题被视为随机效应。上下文有 40 个级别。
我打电话给方差分析():
anova(mod_min, mod8)
并获得输出:
Data: wort12_lmm2_melt
Models:
mod_min: log(value) ~ 1 + (sentence_type | subject) + (sentence_type | context)
mod8: log(value) ~ matching + (sentence_type | subject) + (sentence_type |
mod8: context)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
mod_min 8 3317.6 3375.8 -1650.8 3301.6
mod8 9 3310.9 3376.4 -1646.4 3292.9 8.6849 1 0.003209 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
不相信全能的 p 我手动设置了一个参数引导程序:
mod <- mod8
modnull <- mod_min
lrt.obs <- anova(mod, modnull)$Chisq[2] # save the observed likelihood ratio test statistic
n.sim <- 10000
lrt.sim <- numeric(n.sim)
dattemp <- mod@frame
# pb <- txtProgressBar(min = 0, max = n.sim, style = 3) # set up progress bar to satisfy need for control
for(i in 1:n.sim) {
# Sys.sleep(0.1) # progress bar related stuff
ysim <- unlist(simulate(modnull) # simulate new observations from the null-model
modnullsim <- lmer(ysim
~ 1
+ (sentence_type | subject)
+ (sentence_type | context),
data = dattemp,
REML = FALSE) # fit the null-model
modaltsim <- lmer(ysim
~ matching
+ (sentence_type | subject)
+ (sentence_type | context),
data = dattemp,
REML = FALSE) # fit the alternative model
lrt.sim[i] <- anova(modnullsim, modaltsim)$Chisq[2] # save the likelihood ratio test statistic
# setTxtProgressBar(pb, i)
}
# assuming chi-squared distribution for comparison
pchisq((2*(logLik(mod8)-logLik(mod_min))),
df = 1,
lower = FALSE)
输出:
'log Lik.' 0.003208543 (df=9)
与参数引导 p 值比较
p_mod8_mod_min <- (sum(lrt.sim>=lrt.obs)+1)/(n.sim+1) # p-value. alternative: mean(lrt.sim>lrt.obs)
输出:
[1] 0.00319968
绘制整个事情:
xx <- seq(0, 20, 0.1)
hist(lrt.sim,
xlim = c(0, max(c(lrt.sim, lrt.obs))),
col = "blue",
xlab = "likelihood ratio test statistic",
ylab = "density",
cex.lab = 1.5,
cex.axis = 1.2,
freq = FALSE)
abline(v = lrt.obs,
col = "orange",
lwd = 3)
lines(density(lrt.sim),
col = "blue")
lines(xx,
dchisq(xx, df = 1),
col = "red")
box()
产生:

我确实有一些问题:
(1) 程序是正确的还是我犯了错误?
(2)如何解释直方图?
(3) 直方图的形式是正常的还是极端的?
谢谢你的帮助!