我有可以使用模型描述的数据, 在哪里和. 有 48 组 768 个观测值,每组 16 个观测值,所以和. 我想计算 95% 的置信区间. 我决定尝试拟合模型使用 lme4 包的数据并使用包的计算置信区间功能。我创建了一个模拟来检查置信区间的覆盖范围;代码如下。覆盖率似乎在 94.5% 左右,而不是 95%。我的模拟中是否存在缺陷,或者这个结果是意料之中的?
# Load packages -----------------------------------------------------------
library(lme4)
library(tidyverse)
# Define functions --------------------------------------------------------
make_dataset <- function(groups, group_size,
mu_alpha, sigma_alpha, sigma_epsilon) {
replicate(
groups,
{
group_mean <- rnorm(1, mean = mu_alpha, sd = sigma_alpha)
errors <- rnorm(group_size, sd = sigma_epsilon)
tibble(y = group_mean + errors)
},
simplify = FALSE
) %>%
bind_rows(.id = "group") %>%
mutate(group = as_factor(group))
}
check_conf_int <- function(dataset, mu_alpha) {
mod <- lmer(y ~ (1 | group), dataset)
conf_int <- confint(mod, "(Intercept)")
between(mu_alpha, conf_int[, "2.5 %"], conf_int[, "97.5 %"])
}
# Set simulation parameters -----------------------------------------------
groups <- 48
group_size <- 16
mu_alpha <- 0
sigma_alpha <- 1
sigma_epsilon <- 0.25
runs <- 10000
# Make datasets -----------------------------------------------------------
set.seed(1)
# Takes 5 minutes
datasets <- replicate(
runs,
make_dataset(groups, group_size, mu_alpha, sigma_alpha, sigma_epsilon),
simplify = FALSE
)
# Run simulation ---------------------------------------------------------
# Takes 30 minutes
results <- map_lgl(datasets, check_conf_int, mu_alpha = mu_alpha)
tibble(
runs = seq_len(runs),
coverage = cummean(results)
) %>%
ggplot(aes(runs, coverage)) +
geom_line() +
geom_hline(linetype = "dashed", yintercept = 0.95) +
coord_cartesian(ylim = c(0.93, 0.97))
下面是最后一个代码块产生的图:
下面是输出:
R version 4.0.0 (2020-04-24)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.0 stringr_1.4.0 dplyr_0.8.5 purrr_0.3.4 readr_1.3.1
[6] tidyr_1.0.3 tibble_3.0.1 ggplot2_3.3.0 tidyverse_1.3.0 lme4_1.1-23
[11] Matrix_1.2-18
loaded via a namespace (and not attached):
[1] statmod_1.4.34 tidyselect_1.1.0 xfun_0.13 splines_4.0.0
[5] haven_2.2.0 lattice_0.20-41 colorspace_1.4-1 vctrs_0.3.0
[9] generics_0.0.2 htmltools_0.4.0 yaml_2.2.1 rlang_0.4.6
[13] nloptr_1.2.2.1 pillar_1.4.4 withr_2.2.0 glue_1.4.0
[17] DBI_1.1.0 dbplyr_1.4.3 modelr_0.1.7 readxl_1.3.1
[21] lifecycle_0.2.0 cellranger_1.1.0 munsell_0.5.0 gtable_0.3.0
[25] rvest_0.3.5 evaluate_0.14 labeling_0.3 knitr_1.28
[29] fansi_0.4.1 broom_0.5.6 Rcpp_1.0.4.6 backports_1.1.6
[33] scales_1.1.1 jsonlite_1.6.1 farver_2.0.3 fs_1.4.1
[37] hms_0.5.3 packrat_0.5.0 digest_0.6.25 stringi_1.4.6
[41] grid_4.0.0 cli_2.0.2 tools_4.0.0 magrittr_1.5
[45] crayon_1.3.4 pkgconfig_2.0.3 MASS_7.3-51.6 ellipsis_0.3.0
[49] xml2_1.3.2 reprex_0.3.0 lubridate_1.7.8 assertthat_0.2.1
[53] minqa_1.2.4 rmarkdown_2.1 httr_1.4.1 rstudioapi_0.11
[57] R6_2.4.1 boot_1.3-25 nlme_3.1-147 compiler_4.0.0
更新
如果我将组增加到 480,下面是我得到的图:
这是我将 group_size 更改为 160 时得到的图:





