为了解决您的问题,您可以使用协变量进行逻辑回归,以同时捕获时间和组效应。
library(Hmisc)
#> Loading required package: lattice
#> Loading required package: survival
#> Loading required package: Formula
#> Loading required package: ggplot2
#>
#> Attaching package: 'Hmisc'
#> The following objects are masked from 'package:base':
#>
#> format.pval, units
library(rms)
#> Loading required package: SparseM
#>
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#>
#> backsolve
library(broom)
library(modelr)
#>
#> Attaching package: 'modelr'
#> The following object is masked from 'package:broom':
#>
#> bootstrap
library(yardstick)
#> For binary classification, the first factor level is assumed to be the event.
#> Set the global option `yardstick.event_first` to `FALSE` to change this.
#>
#> Attaching package: 'yardstick'
#> The following objects are masked from 'package:modelr':
#>
#> mae, mape, rmse
library(tidyverse)
make_obs <- function(group, day){
sapply(1:length(group), function(idx){
rbinom(1, 1, day_prob[[group[idx]]][day[idx]] )
})
}
day_prob <-
list(
A = c(0.06, 0.06, pmax(dchisq(1:13, 3)*1, 0.06))
, B = c(0.06, 0.06, pmax(dchisq(seq(1, 25, 2), 5)*6, 0.06))
)
set.seed(12345)
example_data <-
tibble(
Group = rep(c("A", "B"), each = 50)
, Ind = 1:100
, Day = 1
) %>%
complete(nesting(Group, Ind), Day = 1:15) %>%
mutate(
Obs = make_obs(Group, Day)
)
example_data %>%
group_by(Group, Day) %>%
summarise(prop = mean(Obs)) %>%
ggplot(aes(Day, prop, group=Group)) +
geom_line()
通过按天检查分组图并对生成的数据进行分组,您可以看到天和组交互存在高度非线性效应。

我提出以下 4 个模型来研究 Day 和 Group 对 Obs 的影响。在模型 3 和 4 中,我对 Days 进行了非线性变换,这是一个限制三次样条曲线,试图捕捉上图中提到的效果。
df <- example_data
md_log <- glm(Obs ~ Day + Group, data = df, family = 'binomial')
md_log2 <- glm(Obs ~ Day * Group, data = df, family = 'binomial')
md_log3 <- glm(Obs ~ rcs(Day) + Group, data = df, family = 'binomial')
md_log4 <- glm(Obs ~ rcs(Day) * Group, data = df, family = 'binomial')
gather_predictions(df, md_log, md_log2, md_log3, md_log4, type='response') %>%
group_by(model, Group, Day) %>%
summarise(prop = mean(Obs),
fitted = mean(pred)
) %>%
ggplot(aes(x = Day, group=Group)) +
geom_line(aes(y = prop)) +
geom_line(aes(y = fitted, group=model, color = model), lty='dashed') +
facet_wrap(~Group, ncol =1)
我收集来自每个模型的预测并将它们汇总以进行视觉检查。如您所见,模型 3 和 4 捕获了您感兴趣的效果。

md_frame <-
tibble(number= seq(4), model = list(md_log, md_log2, md_log3, md_log4))
get_auc <- . %>% augment( . , df, type.predict='response') %>%
mutate(Obs = as.factor(Obs)) %>%
yardstick::roc_auc(., truth=Obs, .fitted) %>%
pull(.estimate)
md_frame %>%
mutate(auc = map_dbl(model, get_auc)) %>%
mutate(other_metrics = map(model, glance)) %>%
unnest(other_metrics) %>%
select(model_number = number, auc, AIC, deviance, BIC, logLik) -> metrics
metrics %>%
pivot_longer(-model_number, names_to = 'metric_name', values_to='values') %>%
mutate(model_number = paste0('model ', model_number)) %>%
ggplot(aes(model_number, values)) +
geom_point(aes(fill=model_number), size = 4, pch=21, show.legend = F) +
facet_wrap(~metric_name, scales = 'free') +
labs(title = 'Model metrics', x = NULL, y=NULL)
现在,要检查用于访问每个模型的拟合质量的指标,还表明模型 3 和 4 捕获了 Days 效应。通过检查模型,您可以检查系数。

现在,如果您打算使用这些模型执行一些统计推断,那么在估计嵌套方差和时间相关效应时有更好的方法,因此纵向方法在这里可能是有利的。
由reprex 包(v0.3.0)于 2019 年 11 月 14 日创建