I am trying to control overfitting with eta in xgboost in R, but when I compare the xgb.cv readout with the xgb.train readout, I can't tell why xgb.cv does not seem to overfit while xgb.train does. How can I get the same nice downward mlogloss progression in xgb.train? I balanced my classes before running the model.
[1] "########### i is 1 and j 1 ##################"
[1] "Creating cv..."
# this part is good -------------------
[0] train-mlogloss:1.609325+0.000006 test-mlogloss:1.609315+0.000009
[100] train-mlogloss:1.601508+0.001238 test-mlogloss:1.602480+0.001071
[200] train-mlogloss:1.594359+0.002151 test-mlogloss:1.596278+0.001812
[300] train-mlogloss:1.587120+0.002100 test-mlogloss:1.589944+0.001546
[400] train-mlogloss:1.580558+0.001839 test-mlogloss:1.584062+0.001251
[1] "Took 160 seconds to cv train with 500 rounds..."
[1] "Creating model..."
# this part is bad -------------------
[0] train-mlogloss:1.609341 test-mlogloss:1.609383
[100] train-mlogloss:1.602439 test-mlogloss:1.609435
[200] train-mlogloss:1.594991 test-mlogloss:1.609580
[300] train-mlogloss:1.587814 test-mlogloss:1.609732
My code for cv and train, with my parameters, is:
param = list("objective" = "multi:softprob"
, "eval_metric" = "mlogloss"
, 'num_class' = 5
, 'eta' = 0.001)
bst.cv = xgb.cv(param = param
, data = ce.dmatrix
, nrounds = nrounds
, nfold = 4
, stratified = T
, print.every.n = 100
, watchlist = watchlist
, early.stop.round = 10
)
bst = xgb.train(param = param
, data = ce.dmatrix
, nrounds = nrounds
, print.every.n = 100
, watchlist = watchlist
# , early.stop.round = 10
)
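For what it's worth, one difference between the two runs is that xgb.cv scores each fold on data the fold's model never saw, while the xgb.train call above trains on all of ce.dmatrix and has early.stop.round commented out. A minimal sketch of giving xgb.train its own held-out evaluation set plus early stopping, assuming the raw feature matrix and labels are available under the hypothetical names ce.data and ce.label (only the combined ce.dmatrix appears above), and keeping the old-style argument names used in the question:

```r
library(xgboost)

# Hypothetical 80/20 split; ce.data / ce.label are assumed names
set.seed(42)
idx    <- sample(nrow(ce.data), 0.8 * nrow(ce.data))
dtrain <- xgb.DMatrix(ce.data[idx, ],  label = ce.label[idx])
dvalid <- xgb.DMatrix(ce.data[-idx, ], label = ce.label[-idx])

bst <- xgb.train(param = param
                 , data = dtrain
                 , nrounds = nrounds
                 # evaluate on data the model never trains on
                 , watchlist = list(train = dtrain, test = dvalid)
                 , print.every.n = 100
                 # stop once test-mlogloss stalls, as in the cv call
                 , early.stop.round = 10
                 )
```

With this setup the test-mlogloss column in the xgb.train log is measured the same way as in xgb.cv, so the two readouts become comparable.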