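When I train a decision tree, I get 100% accuracy on my test set, but a random forest only reaches 85%.
Is something wrong with my model, or is the decision tree simply the best fit for this dataset?
Code: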
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.tree import DecisionTreeClassifier

# Hold out 20% of the data for testing (x holds the features, y the labels).
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.20)

# Random Forest
rf = RandomForestClassifier(n_estimators=1000, random_state=42)
rf.fit(x_train, y_train)
predictions = rf.predict(x_test)
cm = confusion_matrix(y_test, predictions)
print(cm)

# Decision Tree
clf = DecisionTreeClassifier()
clf.fit(x_train, y_train)
predictions = clf.predict(x_test)
cm = confusion_matrix(y_test, predictions)
print(cm)
Confusion matrices:
Random forest:
[[19937     1]
 [    8    52]]
Decision tree:
[[19938     0]
 [    0    60]]
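For reference, here is a quick sanity check of what these two matrices imply. It is only a sketch and rests on two assumptions: that rows are true labels and columns are predicted labels (scikit-learn's convention for `confusion_matrix`), and that class 1 is the rare class. The helper name `summarize` is made up for this illustration. Under those assumptions the random forest's overall accuracy works out to roughly 99.95%, while its recall on the rare class is about 86.7% (52 of 60), which might be where the 85% figure comes from.

```python
import numpy as np

# Confusion matrices copied from the output above
# (assumed layout: rows = true class, columns = predicted class).
cm_rf = np.array([[19937, 1], [8, 52]])
cm_dt = np.array([[19938, 0], [0, 60]])


def summarize(cm):
    """Return (accuracy, recall on class 1) for a 2x2 confusion matrix."""
    tn, fp, fn, tp = cm.ravel()
    accuracy = (tn + tp) / cm.sum()   # share of all test samples predicted correctly
    recall = tp / (tp + fn)           # share of true class-1 samples that were found
    return accuracy, recall


for name, cm in [("Random forest", cm_rf), ("Decision tree", cm_dt)]:
    acc, rec = summarize(cm)
    print(f"{name}: accuracy={acc:.4f}, recall(class 1)={rec:.4f}")
```

With only 60 positive samples out of 19,998, overall accuracy is dominated by the majority class, so per-class recall is probably the more informative number to compare between the two models.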