
Confusion matrix does not support continuous variables

data-mining python scikit-learn regression confusion-matrix

2021-10-14 01:50:50

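I am applying a linear regression algorithm to a dataset and trying to compute the confusion matrix between y_pred and y_test. I get the error "ValueError: continuous is not supported".

I have included the code below. Please help me resolve this issue.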

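# data_set is assumed to be a pandas DataFrame loaded earlier
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import confusion_matrix

x = data_set.iloc[:, :-1].values   # feature columns (all but the last)
y = data_set.iloc[:, 7].values     # target column (index 7)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)

regression = LinearRegression()
regression.fit(x_train, y_train)

y_pred = regression.predict(x_test)  # continuous (float) predictions

cm = confusion_matrix(y_test, y_pred)  # raises ValueError: continuous is not supported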

1 Answer

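A confusion matrix tells you how many predictions were classified correctly or incorrectly. You are looking at a regression model, which gives you continuous output (not classes).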

So when you run confusion_matrix(y_test, y_pred), it throws a ValueError because it expects class predictions, not floating-point numbers.
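To see why the call is rejected, it can help to check how scikit-learn classifies the target arrays. The sketch below uses small made-up arrays (not the asker's data) and the helper type_of_target from sklearn.utils.multiclass; the variable names are illustrative only.

```python
from sklearn.metrics import confusion_matrix
from sklearn.utils.multiclass import type_of_target

# Made-up values for illustration; not taken from the question's dataset
y_float_demo = [3.2, 1.5, 4.8, 2.1]   # regression-style targets
y_label_demo = [0, 1, 1, 0]           # class labels

continuous_kind = type_of_target(y_float_demo)
binary_kind = type_of_target(y_label_demo)
print(continuous_kind, binary_kind)

# Passing float targets to confusion_matrix reproduces the error
try:
    confusion_matrix(y_float_demo, y_float_demo)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

If the reported type is "continuous", the inputs are not class labels and a confusion matrix cannot be built from them directly.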

Are you trying to predict classes, or really just a numeric output? If it is just a number, you should not be using a confusion matrix.
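If the goal really is a numeric output, regression metrics are the usual alternative to a confusion matrix. The following is a minimal sketch with made-up arrays standing in for y_test and y_pred; which metric fits best depends on the problem and is not something the question specifies.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up stand-ins for y_test and y_pred
y_test_demo = np.array([3.0, 5.0, 2.5, 7.0])
y_pred_demo = np.array([2.5, 5.0, 3.0, 8.0])

mae = mean_absolute_error(y_test_demo, y_pred_demo)  # average absolute error
mse = mean_squared_error(y_test_demo, y_pred_demo)   # average squared error
r2 = r2_score(y_test_demo, y_pred_demo)              # proportion of variance explained

print(mae, mse, r2)
```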

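If you want to predict, for example, 1 or 0 for your y values, then you have to convert your linear regression predictions into one of these classes. You could say that any value in y_pred above 0.7 is a 1 and anything below is a 0.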

import numpy as np

cutoff = 0.7                                        # decide on a cutoff limit
y_pred_classes = np.zeros_like(y_pred, dtype=int)   # initialise an array of zeros
y_pred_classes[y_pred > cutoff] = 1                 # set to 1 where the prediction exceeds the cutoff

You have to do the same for the actual values too:

y_test_classes = np.zeros_like(y_test, dtype=int)
y_test_classes[y_test > cutoff] = 1

Now run the confusion matrix as before:

confusion_matrix(y_test_classes, y_pred_classes)

This gives the output:

array([[2, 3],
       [0, 0]])
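For reference, the thresholding steps can be put together in one self-contained sketch. The arrays below are made up for illustration and do not reproduce the output shown above; the labels=[0, 1] argument is an addition here so that the result stays 2x2 even when one class never occurs.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up continuous values standing in for y_test and y_pred
y_test_demo = np.array([0.2, 0.9, 0.4, 0.8, 0.1])
y_pred_demo = np.array([0.3, 0.75, 0.8, 0.6, 0.2])

cutoff = 0.7  # same cutoff as in the answer; the right value depends on the data

# Boolean comparison converted to 0/1 integer class labels
y_test_classes_demo = (y_test_demo > cutoff).astype(int)
y_pred_classes_demo = (y_pred_demo > cutoff).astype(int)

# Rows are actual classes, columns are predicted classes
cm_demo = confusion_matrix(y_test_classes_demo, y_pred_classes_demo, labels=[0, 1])
print(cm_demo)
```

Note that the choice of cutoff changes the matrix, so it is worth trying a few values before drawing conclusions.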
