我一直在开发基于英国国家头部创伤数据库的回顾性数据的逻辑回归模型。关键结果是 30 天死亡率(表示为“生存”度量)。其他已发表证据表明对先前研究的结果有显着影响的措施包括:
Year - Year of procedure = 1994-2013
Age - Age of patient = 16.0-101.5
ISS - Injury Severity Score = 0-75
Sex - Gender of patient = Male or Female
inctoCran - Time from head injury to craniotomy in minutes =
0-2880 (After 2880 minutes is defined as a separate diagnosis)
使用这些模型,给定二分因变量,我使用lrm
.
模型变量选择的方法是基于现有的临床文献建模相同的诊断。除了传统上通过分数多项式建模的 ISS 之外,所有模型都采用线性拟合进行建模。没有出版物确定上述变量之间已知的显着相互作用。
根据 Frank Harrell 的建议,我继续使用回归样条对 ISS 建模(下面的评论中强调了这种方法的优点)。因此,模型预先指定如下:
rcs.ASDH <- lrm(formula = Survive ~ Age + GCS + rcs(ISS) +
Year + inctoCran + oth, data = ASDH_Paper1.1, x=TRUE, y=TRUE)
该模型的结果是:
> rcs.ASDH
Logistic Regression Model
lrm(formula = Survive ~ Age + GCS + rcs(ISS) + Year + inctoCran +
oth, data = ASDH_Paper1.1, x = TRUE, y = TRUE)
Model Likelihood Discrimination Rank Discrim.
Ratio Test Indexes Indexes
Obs 2135 LR chi2 342.48 R2 0.211 C 0.743
0 629 d.f. 8 g 1.195 Dxy 0.486
1 1506 Pr(> chi2) <0.0001 gr 3.303 gamma 0.487
max |deriv| 5e-05 gp 0.202 tau-a 0.202
Brier 0.176
Coef S.E. Wald Z Pr(>|Z|)
Intercept -62.1040 18.8611 -3.29 0.0010
Age -0.0266 0.0030 -8.83 <0.0001
GCS 0.1423 0.0135 10.56 <0.0001
ISS -0.2125 0.0393 -5.40 <0.0001
ISS' 0.3706 0.1948 1.90 0.0572
ISS'' -0.9544 0.7409 -1.29 0.1976
Year 0.0339 0.0094 3.60 0.0003
inctoCran 0.0003 0.0001 2.78 0.0054
oth=1 0.3577 0.2009 1.78 0.0750
然后我使用了 rms 包中的校准函数来评估模型预测的准确性。获得了以下结果:
plot(calibrate(rcs.ASDH, B=1000), main="rcs.ASDH")
完成模型设计后,我创建了下图来展示事件年份对生存的影响,基于连续变量的中位数和分类变量的众数:
ASDH <- Predict(rcs.ASDH, Year=seq(1994,2013,by=1), Age=48.7,
ISS=25, inctoCran=356, Other=0, GCS=8, Sex="Male",
neuroYN=1, neuroFirst=1)
Probabilities <- data.frame(cbind(ASDH$yhat,
exp(ASDH$yhat)/(1+exp(ASDH$yhat)),
exp(ASDH$lower)/(1+exp(ASDH$lower)),
exp(ASDH$upper)/(1+exp(ASDH$upper))))
names(Probabilities) <- c("yhat", "p.yhat", "p.lower", "p.upper")
ASDH <- merge(ASDH, Probabilities, by="yhat")
plot(ASDH$Year, ASDH$p.yhat, xlab="Year", ylab="Probability of
Survival", main="30 Day Outcome Following Craniotomy for Acute SDH
by Year", ylim=range(c(ASDH$p.lower,ASDH$p.upper)), pch=19)
arrows(ASDH$Year, ASDH$p.lower, ASDH$Year, ASDH$p.upper,
length=0.05, angle=90, code=3)
上面的代码产生了以下输出:
我剩下的问题如下:
1. Spline Interpretation - 我如何计算结合整体变量的样条曲线的 p 值?