Cross Validated - Predicting estimates in new data with a gamm4 model

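I have been trying to use gamm4 to fit GAMMs to some repeated-measures data.

The models look very good and seem to offer more flexibility than my LMMs.

Ultimately, I want to compare the models not on their quality of fit (comparing LMM and GAMM fits seems complicated in practice?), but on their predictive quality in new datasets and in new data from MCMC simulation.

For the LMM, I predict using only the fixed effects, as follows: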

mm <- model.matrix(terms(lmer),newdata)

newdata$predicted <- mm %*% fixef(lmer)

This works well because we are predicting for new individuals, who have new, independent random effects.
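For reference, here is a minimal sketch of that fixed-effects-only prediction on data anyone can run. It assumes lme4's bundled `sleepstudy` data and a simple random-intercept model; the object names `fit`, `manual` and `builtin` are only illustrative. It uses `delete.response()` so that `newdata` does not need to contain the response column, and compares the manual product with `predict(..., re.form = NA)`, which lme4 provides for population-level predictions.

```r
library(lme4)

# Random-intercept LMM on the sleepstudy data shipped with lme4
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# New individuals: only the fixed-effect covariates are known
newdata <- data.frame(Days = 0:9)

# Manual route: drop the response from the terms, build the design matrix,
# and multiply by the fixed-effect coefficients
mm <- model.matrix(delete.response(terms(fit)), newdata)
manual <- drop(mm %*% fixef(fit))

# Built-in route: re.form = NA sets every random effect to zero
builtin <- predict(fit, newdata = newdata, re.form = NA)

# The two routes should agree
all.equal(unname(manual), unname(builtin))
```

The built-in route is probably the safer default, since it handles factor levels and transformed predictors without any hand-built design matrix.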

I cannot get this prediction approach to work with gamm4.

> mm <- model.matrix(terms(gamm4$mer), newdata)

Error in model.frame.default(object, data, xlev = xlev) : 
  variable lengths differ (found for 'X')

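I think this is because the GAM procedure creates new variables to transform the predictors. It is also complicated because I believe the transformations are stored as random effects, so I would need to extract those random effects but not the "individual-level" random effects.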
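One way to look at those constructed variables directly is the linear predictor matrix of the `$gam` component that `gamm4()` returns. The sketch below is only an illustration under assumptions: it uses simulated data from `gamSim()` (in mgcv, which gamm4 loads), and the names `Xp`, `manual` and `builtin` are made up here. `predict(..., type = "lpmatrix")` evaluates the spline basis at the new covariate values, so multiplying it by `coef(mod$gam)` should reproduce the population-level prediction without touching the grouping random effect.

```r
library(gamm4)  # also attaches mgcv, which provides gamSim()

# Simulated data with three smooth effects and a grouping factor `fac`
set.seed(100)
dat <- gamSim(6, n = 100, scale = 2)
mod <- gamm4(y ~ s(x0) + s(x1) + s(x2), data = dat, random = ~ (1 | fac))

# New covariate values; no grouping factor is needed
newdat <- data.frame(x0 = runif(5), x1 = runif(5), x2 = runif(5))

# Basis functions evaluated at newdat: one column per gam coefficient
Xp <- predict(mod$gam, newdata = newdat, type = "lpmatrix")

# Manual prediction: basis matrix times the intercept and spline coefficients
manual <- drop(Xp %*% coef(mod$gam))

# Same quantity from predict(); the (1 | fac) effect is not included
builtin <- as.numeric(predict(mod$gam, newdata = newdat))

all.equal(unname(manual), builtin)
```

For a standalone implementation outside R, the basis construction itself would still have to be reproduced; the matrix product above only covers the final step.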

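Does anyone know how I can:

Extract only the transformed (smooth) effect terms from a gamm4 model?
Make predictions for new data with gamm4?
Extract the model specification of the GAMM so that I can implement it as a standalone algorithm?
Any general advice?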

# Load the gamm4 package
library(gamm4)

# Using gamm4's built-in data simulation capabilities to give us some data:
set.seed(100)
dat <- gamSim(6, n = 100, scale = 2)

# Fitting a model and plotting it:
mod <- gamm4(y ~ s(x0) + s(x1) + s(x2), data = dat, random = ~ (1 | fac))
plot(mod$gam, pages = 1)

# Generating some new data for which you'd like predictions:
newdat <- data.frame(x0 = runif(100), x1 = runif(100), x2 = runif(100))

# Getting predicted outcomes for new data
# These include the splines but ignore other REs
predictions <- predict(mod$gam, newdata = newdat, se.fit = TRUE)

# Consolidating new data and predictions
newdat <- cbind(newdat, predictions)

# If you want CIs
newdat <- within(newdat, {
  lower <- fit - 1.96 * se.fit
  upper <- fit + 1.96 * se.fit
})

# Plot, for example, the predicted outcomes as a function of x1...
library(ggplot2)
egplot <- ggplot(newdat, aes(x = x1, y = fit)) +
  geom_smooth() +
  geom_point()
egplot
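If only the smooth contributions are wanted, rather than the total prediction, `predict()` on the `$gam` component can also return them term by term. This is a hedged sketch on the same kind of simulated data; the names `sm` and `total` are illustrative. With `type = "terms"` each smooth comes back as its own centred column, and adding the intercept to the row sums should give back the ordinary prediction.

```r
library(gamm4)  # also attaches mgcv, which provides gamSim()

set.seed(100)
dat <- gamSim(6, n = 100, scale = 2)
mod <- gamm4(y ~ s(x0) + s(x1) + s(x2), data = dat, random = ~ (1 | fac))
newdat <- data.frame(x0 = runif(100), x1 = runif(100), x2 = runif(100))

# One column per smooth term, each centred around zero
sm <- predict(mod$gam, newdata = newdat, type = "terms")
colnames(sm)

# Intercept plus the per-term contributions rebuilds the full prediction
total <- coef(mod$gam)[["(Intercept)"]] + rowSums(sm)
full <- as.numeric(predict(mod$gam, newdata = newdat))

all.equal(unname(total), full)
```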