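I am predicting 10 class labels (encoded with scikit-learn) from 6 features, with 1.2 million samples. DecisionTreeClassifier, RandomForestClassifier and ExtraTreesClassifier all give an accuracy of about 0.9 (and similar precision and recall).
AdaBoostClassifier and GradientBoostingClassifier give an accuracy of about 0.2.
Any pointers on what could explain such a huge difference?
(I am running GridSearchCV.) Code: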
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# X holds the 6 features and y the encoded class labels described above.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

def output_metrics(y_true, y_pred):
    # Accuracy, plus the unweighted mean of the per-class precision and recall
    # (average=None returns one score per class).
    print("Accuracy:", accuracy_score(y_true, y_pred))
    print("Precision:", precision_score(y_true, y_pred, average=None).mean())
    print("Recall:", recall_score(y_true, y_pred, average=None).mean())

# Grid-search the number of boosting rounds with 5-fold cross-validation.
tree_para = {"n_estimators": [16, 32]}
clf = GridSearchCV(AdaBoostClassifier(), tree_para, cv=5)
model = clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
output_metrics(y_test, y_pred)
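One thing that may be worth checking, offered as a guess rather than a diagnosis: by default AdaBoostClassifier boosts depth-1 decision stumps, while the tree models that score 0.9 grow unrestricted trees. The grid above only varies n_estimators, so the depth of the base tree is never explored. The sketch below compares a few base-tree depths on a synthetic stand-in dataset. The data from make_classification, the depths (1, 4, 8) and all variable names are assumptions for illustration, not the real data or a recommended setting.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (assumption): 6 features, 10 classes.
X_demo, y_demo = make_classification(
    n_samples=3000, n_features=6, n_informative=6, n_redundant=0,
    n_classes=10, n_clusters_per_class=1, random_state=42,
)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42
)

depth_scores = {}
for depth in (1, 4, 8):
    # The base tree is passed positionally because its keyword name
    # differs between scikit-learn versions.
    booster = AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=depth), n_estimators=32, random_state=42
    )
    booster.fit(Xd_train, yd_train)
    depth_scores[depth] = booster.score(Xd_test, yd_test)

for depth, score in depth_scores.items():
    print(f"max_depth={depth}: accuracy={score:.3f}")
```

If accuracy on the real data rises sharply with deeper base trees, the gap is probably down to weak default base learners rather than a bug in the evaluation code.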