Model stacking algorithm

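I am experimenting with stacking to see whether it improves my results, but before using any R packages I decided to write the code myself. Here is the pseudocode for what I am doing: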

train.all = getTrain()

# separate 20% of data to test the stacked model
test.meta.idx = sample(nrow(train.all), floor(nrow(train.all)*0.2))
test.meta = train.all[test.meta.idx, ]

# remove these from train.all
train.all = train.all[-test.meta.idx, ]

# generate folds for cross-validation
k = 10
folds = generateFolds(k)

# dataset to store base learners predictions
train.meta = data.frame()

for (i in 1:k) {
   train.idx = folds[[i]]$train
   test.idx = folds[[i]]$test

   train = train.all[train.idx, ]
   test = train.all[test.idx, ]

   # train models
   model1 = fitmodel1(formula, train)
   model2 = fitmodel2(formula, train)
   model3 = fitmodel3(formula, train)

   # get model outputs
   y1 = predict(model1, test)
   y2 = predict(model2, test)
   y3 = predict(model3, test)

   y.obs = test$y

   # append this fold's out-of-fold predictions, one row per test observation
   train.meta = rbind(train.meta, data.frame(y = y.obs, y1 = y1, y2 = y2, y3 = y3))
}

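Now I can use train.meta to fit a different model (the meta-model), which takes the predictions of model1, model2 and model3 as inputs and gives the final result. But how do I test it? A different model1, model2 and model3 is fitted in each fold, so I end up with 10 different versions of each of them.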

Should I refit the base learners on the whole training data?
Is it OK to train the meta-model on the fitted values of the base learners?
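
For concreteness, the first option could be sketched roughly as below. This is only an illustration under stated assumptions, not a definitive answer. The data are simulated, three `lm` fits with different formulas stand in for fitmodel1, fitmodel2 and fitmodel3, the meta-model is also an `lm`, and the helpers `fit.base.learners`, `base.predictions` and `rmse` are hypothetical names introduced here. The meta-model is trained only on out-of-fold predictions. The base learners are then refitted on all of train.all to produce the inputs for test.meta.

```r
# Minimal stacking sketch on simulated data (all names are illustrative).
set.seed(1)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 2 * dat$x1 - dat$x2 + 0.5 * dat$x1 * dat$x2 + rnorm(n)

# Hold out 20% to test the stacked model, as in the pseudocode above
test.meta.idx <- sample(nrow(dat), floor(nrow(dat) * 0.2))
test.meta <- dat[test.meta.idx, ]
train.all <- dat[-test.meta.idx, ]

# Assumption: three linear models stand in for fitmodel1..fitmodel3
formulas <- list(y ~ x1, y ~ x2, y ~ x1 * x2)
fit.base.learners <- function(data) {
  lapply(formulas, function(f) lm(f, data = data))
}

# Returns a data frame with one column of predictions per base learner
base.predictions <- function(models, newdata) {
  preds <- lapply(models, predict, newdata = newdata)
  names(preds) <- paste0("y", seq_along(models))
  as.data.frame(preds)
}

# Out-of-fold predictions: each row of train.all is predicted by base
# learners that did not see that row during fitting
k <- 10
fold.id <- sample(rep(1:k, length.out = nrow(train.all)))
train.meta <- data.frame()
for (i in 1:k) {
  train <- train.all[fold.id != i, ]
  test <- train.all[fold.id == i, ]
  models <- fit.base.learners(train)
  train.meta <- rbind(train.meta,
                      cbind(y = test$y, base.predictions(models, test)))
}

# Meta-model fitted on the out-of-fold predictions only
meta.model <- lm(y ~ y1 + y2 + y3, data = train.meta)

# To score the stack: refit the base learners on all of train.all, predict
# test.meta with them, and pass those predictions to the meta-model
final.models <- fit.base.learners(train.all)
test.inputs <- base.predictions(final.models, test.meta)
y.stacked <- predict(meta.model, newdata = test.inputs)

rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
rmse.stacked <- rmse(test.meta$y, y.stacked)
rmse.base <- sapply(test.inputs, rmse, obs = test.meta$y)
print(c(stacked = rmse.stacked, rmse.base))
```

The second option (training the meta-model on in-sample fitted values) would correspond to replacing the loop with `base.predictions(fit.base.learners(train.all), train.all)`. Those inputs come from models that have already seen the rows they predict, which is the situation the cross-validation loop is meant to avoid.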

Thanks for any advice!