Error in XGBoost: Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
param <- list(
  objective = "binary:logistic",
  booster = "gbtree",
  eval_metric = "auc",
  eta = 0.01,
  max_depth = 5,
  subsample = 0.6,
  colsample_bytree = 0.6
)
t1.y <- t1$target
t1 <- sparse.model.matrix(target ~ ., data = t1)
# The error occurs after the command above. If I change the statement to the following (replacing . with -1 to drop collinear columns):
t1 <- sparse.model.matrix(target ~ -1, data = t1)
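After changing to -1 the error no longer occurs, but the resulting dgCMatrix looks suspicious: it contains only one column and many rows. I expected the columns of the original data to be one-hot encoded, not a single output column. If I skip inspecting the resulting sparse matrix and carry on, running xgboost below produces this error: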
Error in xgb.iter.update(bst$handle, dtrain, iteration - 1, obj) :
[05:14:55] amalgamation/../src/objective/regression_obj.cc:108: label must be in [0,1] for logistic regression
dtrain <- xgb.DMatrix(data=t1, label=t1.y)
watchlist <- list(train=dtrain)
bst1 <- xgboost(
  params = param,
  data = dtrain,
  nrounds = 1750,
  # watchlist = watchlist,
  verbose = 1,
  maximize = FALSE
)
I tried converting the target column to a factor and to numeric (recoding it to 0 and 1, because as.numeric turned it into 1 and 2), but that did not help :'(
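To illustrate the conversion I mean, here is a minimal sketch on a made-up factor. The name `target_f` and its values are hypothetical, and it assumes the factor levels are the strings "0" and "1"; my real target column may be coded differently.

```r
# Hypothetical stand-in for the target column (not my real data)
target_f <- factor(c("0", "1", "1", "0"))

# as.numeric() on a factor returns the internal level codes, which start at 1
codes <- as.numeric(target_f)

# Going through as.character() recovers the values written in the level names
labels <- as.numeric(as.character(target_f))

print(codes)
print(labels)
```

The second form is the one that yields labels usable with binary:logistic, provided the level names really are "0" and "1".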
After converting the target to numeric and running XGBoost again, I get the following error:
Error in xgb.iter.update(bst$handle, dtrain, iteration - 1, obj) :
[05:20:12] amalgamation/../src/tree/updater_colmaker.cc:162: Check failed: (n) > (0) colsample_bytree=0.6 is too small that no feature can be included
*t1 is a data.frame that originally contained factor, numeric, and integer columns, which I converted to a sparse matrix.
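In case it helps with the contrasts error, here is a minimal sketch for counting the levels of each factor column. It runs on a made-up data frame `toy` rather than my real t1, and the helper name `count_levels` is hypothetical.

```r
# Toy data frame standing in for t1 (hypothetical; not the real data)
toy <- data.frame(
  target = c(0, 1, 0, 1),
  a = factor(c("x", "x", "x", "x")),  # a single level: model.matrix cannot build contrasts for it
  b = factor(c("p", "q", "p", "q")),
  n = c(1.5, 2.0, 3.5, 4.0)
)

# Count the levels actually present in every factor column
count_levels <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  vapply(df[is_fac], function(col) nlevels(droplevels(col)), integer(1))
}

lv <- count_levels(toy)

# Names of factor columns with fewer than two levels
single_level <- names(lv)[lv < 2]
print(single_level)
```

Running the same helper on the original data frame (before any conversion to a sparse matrix) should list the columns worth checking.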
Thank you for your help!