Avoiding iteration when computing average model accuracy

data-mining, r, accuracy, cross-validation, sampling, beginner

2021-10-08 04:58:09

I am fitting a model in R. My current procedure is:

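- use the createFolds method to create k folds from the data set
- loop over the folds, repeating the following in each iteration:
  - train the model on k-1 folds
  - predict the outcomes for the i-th fold
  - calculate the prediction accuracy
- average the accuracy over all folds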
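
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data (a two-class subset of the built-in iris data), the model (logistic regression via glm as a stand-in), k = 5, and the seed are all chosen for the example and are not part of the original question.

```r
library(caret)  # provides createFolds()

set.seed(42)  # assumption: fixed seed so the fold assignment is reproducible

# Assumed example data: two-class subset of the built-in iris data
dat <- droplevels(iris[iris$Species != "setosa", ])

k <- 5
# createFolds() returns a list of k integer vectors with the held-out row indices
folds <- createFolds(dat$Species, k = k)

fold_accuracy <- sapply(folds, function(test_idx) {
  train_set <- dat[-test_idx, ]
  test_set  <- dat[test_idx, ]

  # Train on the other k-1 folds (logistic regression as a stand-in model)
  fit <- glm(Species ~ ., data = train_set, family = binomial)

  # Predict the held-out fold; the probability refers to the second factor level
  prob <- predict(fit, newdata = test_set, type = "response")
  pred <- ifelse(prob > 0.5, levels(dat$Species)[2], levels(dat$Species)[1])

  # Accuracy on this fold
  mean(pred == test_set$Species)
})

# Average the per-fold accuracies
mean_accuracy <- mean(fold_accuracy)
mean_accuracy
```

This is the bookkeeping the question asks to avoid writing by hand.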

Does R have a function that creates the folds itself, repeats the model tuning and prediction, and returns the average accuracy?

1 Answer

Yes, you can do all of this with the caret package in R ( http://caret.r-forge.r-project.org/training.html ). For example:

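library(caret)  # the "gbm" method also needs the gbm package installed

## `training` is assumed to be a data frame with a factor column `Class`
fitControl <- trainControl(## 10-fold CV
                           method = "repeatedcv",
                           number = 10,
                           ## repeated ten times
                           repeats = 10)

gbmFit1 <- train(Class ~ ., data = training,
                 method = "gbm",
                 trControl = fitControl,
                 ## This last option is actually one
                 ## for gbm() that passes through
                 verbose = FALSE)
## Printing the fitted object shows the resampled accuracy
gbmFit1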

This gives the following output:

Stochastic Gradient Boosting 

157 samples
 60 predictors
  2 classes: 'M', 'R' 

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times) 

Summary of sample sizes: 142, 142, 140, 142, 142, 141, ... 

Resampling results across tuning parameters:

  interaction.depth  n.trees  Accuracy  Kappa  Accuracy SD  Kappa SD
  1                  50       0.8       0.5    0.1          0.2     
  1                  100      0.8       0.6    0.1          0.2     
  1                  200      0.8       0.6    0.09         0.2     
  2                  50       0.8       0.6    0.1          0.2     
  2                  100      0.8       0.6    0.09         0.2     
  2                  200      0.8       0.6    0.1          0.2     
  3                  50       0.8       0.6    0.09         0.2     
  3                  100      0.8       0.6    0.09         0.2     
  3                  200      0.8       0.6    0.08         0.2     

Tuning parameter 'shrinkage' was held constant at a value of 0.1
Accuracy was used to select the optimal model using  the largest value.
The final values used for the model were n.trees = 150, interaction.depth = 3     
and shrinkage = 0.1.
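
The averaged accuracy shown in the printed table can also be read from the fitted object directly. The sketch below is not part of the original answer: it assumes a small two-class subset of iris and method = "glm" (chosen only so the example runs without the gbm package), and uses the `results` and `resample` components of the object returned by train().

```r
library(caret)

set.seed(42)  # assumption: fixed seed for reproducible resampling

# Assumed example data: two-class subset of the built-in iris data
dat <- droplevels(iris[iris$Species != "setosa", ])

# 10-fold CV repeated 3 times (fewer repeats than above, to keep it quick)
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3)

fit <- train(Species ~ ., data = dat, method = "glm", trControl = ctrl)

# One row per tuning-parameter combination, with Accuracy and Kappa
# averaged over all resamples
fit$results

# Per-resample values for the selected model (10 folds x 3 repeats = 30 rows)
head(fit$resample)

# Averaging the per-resample accuracies gives the reported mean accuracy
avg_acc <- mean(fit$resample$Accuracy)
avg_acc
```

With a tuned model such as gbm, `results` has one row per parameter combination, as in the table above, while `resample` by default holds only the resamples for the selected combination.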

caret offers many other options as well, so it should be able to cover your needs.
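
As one example of those options, the resampling scheme is selected through the `method` argument of trainControl(). The following sketch, again assuming a two-class iris subset and a glm model purely for illustration, uses plain 5-fold cross-validation (the procedure from the question, without repeats) and leave-one-out cross-validation.

```r
library(caret)

set.seed(42)  # assumption: fixed seed for reproducible folds

# Assumed example data: two-class subset of the built-in iris data
dat <- droplevels(iris[iris$Species != "setosa", ])

# Plain 5-fold CV: each row is held out exactly once, no repeats
cv_ctrl <- trainControl(method = "cv", number = 5)
cv_fit  <- train(Species ~ ., data = dat, method = "glm", trControl = cv_ctrl)

# Leave-one-out CV: one model fit per row of the data
loo_ctrl <- trainControl(method = "LOOCV")
loo_fit  <- train(Species ~ ., data = dat, method = "glm", trControl = loo_ctrl)

# Resampled accuracy under each scheme
acc <- c(cv = cv_fit$results$Accuracy, loocv = loo_fit$results$Accuracy)
acc
```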
