So I understand the main difference between random forests and gradient boosting: a random forest grows its trees independently (in parallel), while gradient boosting adds one tree per iteration. However, I'm confused by the vocabulary used by scikit-learn's RF regressor versus xgboost's regressor, specifically around tuning the number of trees / iterations / boosting rounds. As I understand it, these terms all refer to the same thing: how many decision trees the algorithm builds. But should I be calling the parameter ntrees or n_estimators? Or should I simply use early stopping rounds for my xgboost model and only tune the number of trees for my RF?
My random forest:

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

rf = RandomForestRegressor(random_state=13)
param_grid = dict(n_estimators=[250, 500, 750, 1000, 1500, 2000],
                  max_depth=[5, 7, 10, 12, 15, 20, 25],
                  min_samples_split=[2, 5, 10],
                  min_samples_leaf=[1, 3, 5])
gs = GridSearchCV(rf,
                  param_grid=param_grid,
                  scoring='neg_mean_squared_error',
                  n_jobs=-1,
                  cv=5,
                  refit=True)
My xgboost:

from xgboost import XGBRegressor
from sklearn.model_selection import GridSearchCV

model = XGBRegressor(random_state=13)
param_grid = dict(ntrees=[500, 750, 1000, 1500, 2000],
                  max_depth=[1, 3, 5, 7, 10],
                  learning_rate=[0.01, 0.025, 0.05, 0.1, 0.15, 0.2],
                  min_child_weight=[1, 3, 5, 7, 10],
                  colsample_bytree=[0.80, 1])
gs = GridSearchCV(model,
                  param_grid=param_grid,
                  scoring='neg_mean_squared_error',
                  n_jobs=-1,
                  cv=5,
                  refit=True)