来自线性混合模型的条件类内相关(ICC)作为重测信度的证据?

机器算法验证 混合模式 多层次分析 可靠性 层次聚类 类内相关
2022-04-05 13:41:17

在我的两个条件下(受试者间设计)的实验中,参与者完成了 3 次单项量表:(1)实验操作前,(2)实验操作后,(3)研究结束时。我想通过三个测量来测试我的单项量表的重测信度。

这里的问题是,已知实验操作会影响测量中的响应。我假设我需要控制操作的任何影响,所以我的猜测是使用线性混合模型(分层线性模型),测量值嵌套在参与者中,并将实验条件作为一个因素。这样,我可以从这个条件模型中得到一个类内相关性(ICC)。

是否有可能将此条件 ICC 解释为“在控制了实验效果后对重测信度的测量”?是否有任何现有的研究使用类似的方法?

2个回答

是的,您可以这样做并按照您的想法进行解释。我在 Sophia Rabe-Hesketh 和 Anders Skrondal 的 Multilevel and Longitudinal Modeling using Stata一书(第 1 卷)的第二章中读到了这种解释。

更详细的解释如下。编辑:我还添加了一个模拟来演示发生了什么。向 Ariel Muldoon 致敬,感谢他写了一篇有用的博客文章,帮助我创建了这个模拟。

在没有预测变量的随机截距模型中, 我们得到两个方差,一个用于,即,另一个用于,即

yij=β0+u0j+ϵij
u0jψϵijθ

从这些我们可以将主体之间的依赖或可靠性()表示为: ρ

ρ=ψψ+θ

在这个等式中,是受试者真实分数的方差,而是测量误差方差,或测量的平方标准误差。由于重复测量而成为重测信度。ψβ0+u0jθρ

与 Pearson 相关系数相比,受测量的任何线性变换的影响,这可能包括从时间 1 到时间 2 的实践效果或实验引起的增加。因此,如果您知道数据中的某些内容会导致线性变化,您必须在混合模型中考虑它。ρ

在您的情况下,您有一个随时间变化的实验操作(称为)。在您的随机截距模型中包含作为预测变量,x1x1

yij=β0+β1x1+u0j+ϵij

都有影响这样做,得到的估计值不再受的影响,并且你有一个对实验效果稳健的重测信度估计值。ψθψθx1


模拟

set.seed(807)

npart=1000 # number of particpants
ntime=3    # numer of observations (timepoints) per participant
mu=2.5     # mean value on the Likert item
sdp=1      # standard deviation of participant random effect (variance==1)
sd=.7071   # standard deviation of within participant (residual; variance = .5)

participant = rep(rep(1:npart, each = nobs),ntime)  # creating 1000 participants w/ 3 repeats
participant = participant[order(participant)]
time = rep(rep(1:ntime, each=1),1000)    # creating a time variable

parteff = rnorm(npart, 0, sdp)     # drawing from normal for participant deviation
parteff = rep(parteff, each=ntime) # ensuring participant effect is same for three observations

timeeff = rnorm(npart*ntime, 0, sd) # drawing from normal for within-participant residual

dat=data.frame(participant, time, parteff, timeeff) # create data frame

dat$resp = with(dat, mu + parteff + timeeff ) # creating response for each individual

#Variance components model
library(lme4)

m1 <- lmer(resp ~ 1 + (1|participant), dat)
summary(m1) # estimates close to simulated values


Linear mixed model fit by REML ['lmerMod']
Formula: resp ~ 1 + (1 | participant)
   Data: dat

REML criterion at convergence: 8523.8

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-3.13381 -0.57238  0.01722  0.57846  2.84918 

Random effects:
 Groups      Name        Variance Std.Dev.
 participant (Intercept) 1.0110   1.0055  
 Residual                0.5314   0.7289  
Number of obs: 3000, groups:  participant, 1000

Fixed effects:
            Estimate Std. Error t value
(Intercept)  2.54142    0.03447   73.73


#Add treatment variable x1 which turns on at time 3
dat$trtmt = rep(c(0,0,1),1000)
b1 = .4 #average amount by which particpant's score increases b/c of treatment
x1 = runif(npart, .05, 1.5)


library(dplyr)
dat <- dat %>% mutate(resp2=case_when
                      (time==3 ~ (mu+b1*x1+parteff+timeeff),
                        TRUE ~ resp))
glimpse(dat)

#run m1 without covariate for trtmt
m2 <- lmer(resp2 ~ 1 + (1|participant), dat)
summary(m2)

Linear mixed model fit by REML ['lmerMod']
Formula: resp2 ~ 1 + (1 | participant)
   Data: dat

REML criterion at convergence: 8659.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.72238 -0.56861  0.01894  0.57177  3.10610 

Random effects:
 Groups      Name        Variance Std.Dev.
 participant (Intercept) 1.0070   1.0035  
 Residual                0.5669   0.7529  
Number of obs: 3000, groups:  participant, 1000

Fixed effects:
            Estimate Std. Error t value
(Intercept)  2.64169    0.03458   76.39


#add trtmt as a fixed effect predictor
m3 <- lmer(resp2 ~ 1 + trtmt + (1|participant), dat)
summary(m3)

Linear mixed model fit by REML ['lmerMod']
Formula: resp2 ~ 1 + trtmt + (1 | participant)
   Data: dat

REML criterion at convergence: 8546.7

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-3.06878 -0.57650  0.02712  0.57887  2.89709 

Random effects:
 Groups      Name        Variance Std.Dev.
 participant (Intercept) 1.0178   1.0088  
 Residual                0.5346   0.7311  
Number of obs: 3000, groups:  participant, 1000

Fixed effects:
            Estimate Std. Error t value
(Intercept)  2.53746    0.03585   70.78
trtmt        0.31270    0.02832   11.04

Correlation of Fixed Effects:
      (Intr)
trtmt -0.263

> texreg::screenreg(c(m1, m2, m3))

======================================================================
                              Model 1       Model 2       Model 3     
----------------------------------------------------------------------
(Intercept)                       2.54 ***      2.64 ***      2.54 ***
                                 (0.03)        (0.03)        (0.04)   
trtmt                                                         0.31 ***
                                                             (0.03)   
----------------------------------------------------------------------
AIC                            8529.83       8665.86       8554.72    
BIC                            8547.85       8683.88       8578.75    
Log Likelihood                -4261.92      -4329.93      -4273.36    
Num. obs.                      3000          3000          3000       
Num. groups: participant       1000          1000          1000       
Var: participant (Intercept)      1.01          1.01          1.02    
Var: Residual                     0.53          0.57          0.53    
======================================================================
*** p < 0.001; ** p < 0.01; * p < 0.05

这篇文章真的帮助了我,我想感谢你。如果其他用户遇到我遇到的相同问题 - 我将在上面的模拟中添加一些细微的变化。完全相同没什么特别的 - 只是很高兴看到数字匹配 :) 此外,参与者向量中的轻微修正以使其工作。ρ

干杯

日山

set.seed(807)

npart=1000 # number of particpants
ntime=2   # numer of observations (timepoints) per participant
mu=2.5     # mean value on the Likert item
sdp=1      # standard deviation of participant random effect (variance==1)
sd=.7071   # standard deviation of within participant (residual; variance = .5)

participant = rep(rep(1:npart, each = nobs),ntime)  # creating 1000 participants w/ 3 repeats
participant = participant[order(participant)]
time        = rep(rep(1:ntime, each=1),1000)        # creating a time variable

parteff = rnorm(npart, 0, sdp)     # drawing from normal for participant deviation
parteff = rep(parteff, each=ntime) # ensuring participant effect is same for three observations

timeeff = rnorm(npart*ntime, 0, sd) # drawing from normal for within-participant residual

dat=data.frame(participant, time, parteff, timeeff) # create data frame

dat$resp = with(dat, mu + parteff + timeeff ) # creating response for each individual

#Variance components model
library(lme4)

m1 <- lmer(resp ~ 1 + (1|participant), dat)
summary(m1) # estimates close to simulated values

#calculate pearson corr
library(reshape2)
df.wide   <-dcast(dat,participant~time,mean,value.var='resp')[,-1]
cor(df.wide)

#get the same from the HLM fit
print(VarCorr(m1))
.95478^2/(.95478^2+0.74685^2)
 ```