How do I calculate precision and accuracy for sequences that are not strictly binary?

data-mining machine-learning python

2022-02-26 22:47:44

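Given a predicted sequence and an actual sequence, I want to calculate precision and accuracy. Note that the sequences will only contain 0, 1, or -1. For example:

Predicted sequence: -1,0,1,1,-1,0,1,1,0,-1

Actual sequence: -1,1,0,1,-1,1,0,1,0,-1

I know that precision is calculated as tp / (tp + fp) and accuracy as (tp + tn) / (tp + tn + fp + fn). But because the sequences also contain -1, I am not sure how to count true positives. My understanding is that if I predict a 1 and the corresponding actual value is 1, that is a true positive. A walkthrough of the precision and accuracy calculations would help.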

2 Answers

Welcome to the site!

This is a multi-class classification problem.
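One common way to read "true positive" in a multi-class setting is one-vs-rest: pick one label as the positive class and treat every other label as negative, then repeat for each label. The sketch below counts TP, FP, FN, and TN that way for the sequences in the question. The helper name `one_vs_rest_counts` is hypothetical and only for illustration.

```python
from collections import Counter

actual    = [-1, 1, 0, 1, -1, 1, 0, 1, 0, -1]
predicted = [-1, 0, 1, 1, -1, 0, 1, 1, 0, -1]

def one_vs_rest_counts(actual, predicted, positive):
    """Count TP, FP, FN, TN with `positive` as the positive class and
    every other label lumped together as negative (hypothetical helper)."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["tp"] += 1   # predicted the class and it was right
        elif p == positive:
            counts["fp"] += 1   # predicted the class but it was another one
        elif a == positive:
            counts["fn"] += 1   # missed an instance of the class
        else:
            counts["tn"] += 1   # neither predicted nor actual is the class
    return counts

for label in (-1, 0, 1):
    c = one_vs_rest_counts(actual, predicted, label)
    print(label, c["tp"], c["fp"], c["fn"], c["tn"])
```

Per-class precision is then tp / (tp + fp) for each label, while overall accuracy is simply the fraction of positions where the two sequences agree.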

To get the confusion matrix for these sequences, you can use the following:

# Import the required packages
import matplotlib.pyplot as plt
from mlxtend.evaluate import confusion_matrix
from mlxtend.plotting import plot_confusion_matrix

# Actual target values
y_target =    [-1, 1, 0, 1, -1, 1, 0, 1, 0, -1]
# Predicted values
y_predicted = [-1, 0, 1, 1, -1, 0, 1, 1, 0, -1]

# Build the multi-class confusion matrix (rows: actual, columns: predicted)
cm = confusion_matrix(y_target=y_target,
                      y_predicted=y_predicted,
                      binary=False)
# Display the confusion matrix
cm

Result:

array([[3, 0, 0],
       [0, 1, 2],
       [0, 2, 2]])

To visualize cm, you can use the following:

fig, ax = plot_confusion_matrix(conf_mat=cm)
plt.show()

You can learn more about mlxtend through this link.

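With $M$ as the confusion matrix (rows are actual classes, columns are predicted classes), you can get the per-class Precision and Recall values using these formulas:

$\text{Precision}_{~i} = \cfrac{M_{ii}}{\sum_j M_{ji}}$

$\text{Recall}_{~i} = \cfrac{M_{ii}}{\sum_j M_{ij}}$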
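A minimal NumPy sketch of that calculation, assuming the matrix printed above with rows as actual classes and columns as predicted classes in the order -1, 0, 1. The diagonal holds the correct predictions, column sums give everything predicted as a class, and row sums give everything that truly belongs to it.

```python
import numpy as np

# Confusion matrix from above: rows = actual, columns = predicted (-1, 0, 1).
M = np.array([[3, 0, 0],
              [0, 1, 2],
              [0, 2, 2]])

tp = np.diag(M)                  # correct predictions per class
precision = tp / M.sum(axis=0)   # divide by column sums (predicted as class i)
recall = tp / M.sum(axis=1)      # divide by row sums (actually class i)
accuracy = tp.sum() / M.sum()    # overall fraction of correct predictions

print("precision:", precision)
print("recall:   ", recall)
print("accuracy: ", accuracy)
```

Note that a class which is never predicted has a column sum of zero, so its precision is undefined and would need special handling.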

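Go through Link-1 and Link-2 to better understand how these are calculated. Link-3 is a GitHub link that explains how they are implemented for 1-D arrays; see whether you can extend it to your results.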
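As a cross-check, the same numbers can also be obtained with scikit-learn. This library is not used anywhere else in this answer, so treat it as an alternative I am assuming you have installed; `precision_score` and `accuracy_score` come from `sklearn.metrics`.

```python
from sklearn.metrics import accuracy_score, precision_score

y_target    = [-1, 1, 0, 1, -1, 1, 0, 1, 0, -1]
y_predicted = [-1, 0, 1, 1, -1, 0, 1, 1, 0, -1]
labels = [-1, 0, 1]

# average=None returns one precision value per label, in the order of `labels`.
per_class_precision = precision_score(
    y_target, y_predicted, labels=labels, average=None)

# average="macro" is the unweighted mean of the per-class values.
macro_precision = precision_score(
    y_target, y_predicted, labels=labels, average="macro")

# Accuracy is the fraction of positions where prediction and target agree.
accuracy = accuracy_score(y_target, y_predicted)

print(per_class_precision, macro_precision, accuracy)
```

Which average to report (per-class, macro, micro, or weighted) depends on whether all classes matter equally for your problem.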

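Disclaimer,

Hi, you can also use PyCM, a Python module that evaluates classifier performance with a wide range of metrics, not just precision and recall.

You can use it as follows: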

>>> from pycm import ConfusionMatrix
>>> cm1=ConfusionMatrix(y_target,y_predicted)
>>> print(cm1)
Predict          -1    0     1     
Actual
-1               3     0     0     
0                0     1     2     
1                0     2     2     
Overall Statistics : 
95% CI                                                           (0.29636,0.90364)
AUNP                                                             0.69048
AUNU                                                             0.70238
Bennett S                                                        0.4
CBA                                                              0.61111
Chi-Squared                                                      10.27778
Chi-Squared DF                                                   4
Conditional Entropy                                              0.67549
Cramer V                                                         0.71686
Cross Entropy                                                    1.57095
Gwet AC1                                                         0.40299
Hamming Loss                                                     0.4
Joint Entropy                                                    2.24644
KL Divergence                                                    0.0
Kappa                                                            0.39394
Kappa 95% CI                                                     (-0.06612,0.854)
Kappa No Prevalence                                              0.2
Kappa Standard Error                                             0.23473
Kappa Unbiased                                                   0.39394
Lambda A                                                         0.5
Lambda B                                                         0.5
Mutual Information                                               0.89546
NIR                                                              0.4
Overall ACC                                                      0.6
Overall CEN                                                      0.3585
Overall J                                                        (1.53333,0.51111)
Overall MCC                                                      0.39394
Overall MCEN                                                     0.41527
Overall RACC                                                     0.34
Overall RACCU                                                    0.34
P-Value                                                          0.16624
PPV Macro                                                        0.61111
PPV Micro                                                        0.6
Phi-Squared                                                      1.02778
RCI                                                              0.57001
RR                                                               3.33333
Reference Entropy                                                1.57095
Response Entropy                                                 1.57095
SOA1(Landis & Koch)                                              Fair
SOA2(Fleiss)                                                     Poor
SOA3(Altman)                                                     Fair
SOA4(Cicchetti)                                                  Poor
Scott PI                                                         0.39394
Standard Error                                                   0.15492
TPR Macro                                                        0.61111
TPR Micro                                                        0.6
Zero-one Loss                                                    4
Class Statistics :
Classes                                                          -1                      0                       1                       
ACC(Accuracy)                                                    1.0                     0.6                     0.6                     
AUC(Area under the roc curve)                                    1.0                     0.52381                 0.58333                 
AUCI(Auc value interpretation)                                   Excellent               Poor                    Poor                    
BM(Informedness or bookmaker informedness)                       1.0                     0.04762                 0.16667                 
CEN(Confusion entropy)                                           0                       0.52832                 0.5                     
DOR(Diagnostic odds ratio)                                       None                    1.25                    2.0                     
DP(Discriminant power)                                           None                    0.05343                 0.16597                 
DPI(Discriminant power interpretation)                           None                    Poor                    Poor                    
ERR(Error rate)                                                  0.0                     0.4                     0.4                     
F0.5(F0.5 score)                                                 1.0                     0.33333                 0.5                     
F1(F1 score - harmonic mean of precision and sensitivity)        1.0                     0.33333                 0.5                     
F2(F2 score)                                                     1.0                     0.33333                 0.5                     
FDR(False discovery rate)                                        0.0                     0.66667                 0.5                     
FN(False negative/miss/type 2 error)                             0                       2                       2                       
FNR(Miss rate or false negative rate)                            0.0                     0.66667                 0.5                     
FOR(False omission rate)                                         0.0                     0.28571                 0.33333                 
FP(False positive/type 1 error/false alarm)                      0                       2                       2                       
FPR(Fall-out or false positive rate)                             0.0                     0.28571                 0.33333                 
G(G-measure geometric mean of precision and sensitivity)         1.0                     0.33333                 0.5                     
IS(Information score)                                            1.73697                 0.152                   0.32193                 
J(Jaccard index)                                                 1.0                     0.2                     0.33333                 
MCC(Matthews correlation coefficient)                            1.0                     0.04762                 0.16667                 
MCEN(Modified confusion entropy)                                 0                       0.52877                 0.52832                 
MK(Markedness)                                                   1.0                     0.04762                 0.16667                 
N(Condition negative)                                            7                       7                       6                       
NLR(Negative likelihood ratio)                                   0.0                     0.93333                 0.75                    
NPV(Negative predictive value)                                   1.0                     0.71429                 0.66667                 
P(Condition positive or support)                                 3                       3                       4                       
PLR(Positive likelihood ratio)                                   None                    1.16667                 1.5                     
PLRI(Positive likelihood ratio interpretation)                   None                    Poor                    Poor                    
POP(Population)                                                  10                      10                      10                      
PPV(Precision or positive predictive value)                      1.0                     0.33333                 0.5                     
PRE(Prevalence)                                                  0.3                     0.3                     0.4                     
RACC(Random accuracy)                                            0.09                    0.09                    0.16                    
RACCU(Random accuracy unbiased)                                  0.09                    0.09                    0.16                    
TN(True negative/correct rejection)                              7                       5                       4                       
TNR(Specificity or true negative rate)                           1.0                     0.71429                 0.66667                 
TON(Test outcome negative)                                       7                       7                       6                       
TOP(Test outcome positive)                                       3                       3                       4                       
TP(True positive/hit)                                            3                       1                       2                       
TPR(Sensitivity, recall, hit rate, or true positive rate)        1.0                     0.33333                 0.5                     
Y(Youden index)                                                  1.0                     0.04762                 0.16667                 
dInd(Distance index)                                             0.0                     0.72531                 0.60093                 
sInd(Similarity index)                                           1.0                     0.48713                 0.57508
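To see where the PPV Macro and PPV Micro rows in the output above come from, the two averages can be reproduced by hand from the per-class TP and FP values. I am assuming the usual macro and micro definitions here, which match the reported numbers:

$\text{Precision}_{\text{macro}} = \cfrac{1}{3}\left(\cfrac{3}{3+0} + \cfrac{1}{1+2} + \cfrac{2}{2+2}\right) = \cfrac{11}{18} \approx 0.61111$

$\text{Precision}_{\text{micro}} = \cfrac{3+1+2}{(3+0)+(1+2)+(2+2)} = \cfrac{6}{10} = 0.6$

The micro average equals the overall accuracy here because every sample gets exactly one predicted label, so the total number of predictions is the number of samples.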
