
How do I choose between models when their AUC scores are similar?

Tags: data-mining, machine-learning, metric

2022-02-28 17:36:34

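I used two machine learning algorithms for binary classification and got these results:

Algorithm 1:

 AUC (train): 0.75      AUC (test): 0.65          large train/test gap, more overfitting

Algorithm 2:

 AUC (train): 0.72      AUC (test): 0.65          small train/test gap, less overfitting

Which one is better?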

3 Answers

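Judged by AUC alone, the two models are equivalent. Whether a model overfits is not what matters; what matters is how it performs on new data (the test score).

Overfitting only indicates that there may be room for improvement by making the model more general. But until the test score actually increases, the model has not improved, even if it overfits less.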
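To make "the test score" concrete, here is a minimal sketch of what AUC measures: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function name `auc_score` and the toy labels and scores are illustrative assumptions, not data from the question.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels and model scores (not from the question).
y_test = [0, 0, 1, 1]
test_scores = [0.1, 0.4, 0.35, 0.8]
test_auc = auc_score(y_test, test_scores)
print(test_auc)
```

Computing the same quantity on the training data and on the held-out data gives the two numbers being compared in the question; this answer looks only at the held-out one.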

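Algorithm 2

Between models with equal test scores, choose the one with the smaller gap between training and test scores (Algorithm 2), because the one with the better training score (Algorithm 1) is more overfit. We tolerate a more overfit model only when it has a subjectively better test score.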
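The rule in this answer can be sketched as a small tie-breaker. This is only an illustration: the function name `pick_model` and the `tolerance` that decides when two test scores count as "equal" are assumptions, and the right tolerance depends on how noisy your test estimate is.

```python
def pick_model(candidates, tolerance=0.005):
    """Pick a model name from {name: (train_auc, test_auc)}.

    Models whose test AUC is within `tolerance` of the best test AUC are
    treated as tied; ties are broken by the smallest train/test gap.
    """
    best_test = max(test for _, test in candidates.values())
    contenders = {
        name: (train, test)
        for name, (train, test) in candidates.items()
        if best_test - test <= tolerance
    }
    return min(
        contenders,
        key=lambda name: abs(contenders[name][0] - contenders[name][1]),
    )


# The two results from the question.
results = {"Algorithm 1": (0.75, 0.65), "Algorithm 2": (0.72, 0.65)}
chosen = pick_model(results)
print(chosen)
```

A model with a clearly better test score still wins regardless of its gap, which matches the last sentence of the answer.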

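For a better justification, think about how we train neural networks. We stop training when the validation score stops improving, even though the training score would keep improving. If we let training continue, the model would start making additional assumptions based on the training set that have not been reviewed by the critic (the validation set), which makes the model more prone to building false assumptions about the data.

By the same reasoning, a model that has the same performance according to the critic (the test set) but performs better on the training set (Algorithm 1) is more prone to having made untested assumptions about the data.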
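The early-stopping analogy can be sketched as follows. The function name `early_stopping_epoch`, the `patience` parameter, and the per-epoch validation scores are all hypothetical; real frameworks implement this with their own callbacks.

```python
def early_stopping_epoch(val_scores, patience=2):
    """Return (best_epoch, best_score) under a simple patience rule.

    Training stops once `patience` epochs have passed without the
    validation score beating the best value seen so far.
    """
    best_score = float("-inf")
    best_epoch = -1
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            break  # the critic has stopped approving further training
    return best_epoch, best_score


# Hypothetical validation AUC per epoch; the training AUC is assumed to
# keep rising the whole time and is deliberately ignored.
val_auc = [0.60, 0.63, 0.65, 0.65, 0.64, 0.66]
stop = early_stopping_epoch(val_auc, patience=2)
print(stop)
```

The stopping decision never looks at the training score, which is the point of the analogy: extra fit to the training set that the validation set has not confirmed is not counted as progress.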

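You cannot tell which is better from this metric alone, because AUC does not distinguish between these two results. You should use some other metrics, such as Kappa, or some benchmarks.

Disclaimer:

If you use Python, I suggest the PyCM module, which takes your confusion matrix as input and calculates about 100 overall and class-based metrics.

To use this module, first prepare your confusion matrix and see its recommended parameters with the following code: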

>>> from pycm import ConfusionMatrix
>>> cm = ConfusionMatrix(matrix={"0": {"0": 1, "1": 0, "2": 0}, "1": {"0": 0, "1": 1, "2": 2}, "2": {"0": 0, "1": 1, "2": 0}})
>>> print(cm.recommended_list)
["Kappa", "SOA1(Landis & Koch)", "SOA2(Fleiss)", "SOA3(Altman)", "SOA4(Cicchetti)", "CEN", "MCEN", "MCC", "J", "Overall J", "Overall MCC", "Overall CEN", "Overall MCEN", "AUC", "AUCI", "G", "DP", "DPI", "GI"]

Then see the values of the metrics, paying attention to the recommended ones, with the following code:

>>> print(cm)
    Predict          0        1        2        
    Actual
    0                1        0        0        
    1                0        1        2        
    2                0        1        0        




Overall Statistics : 

95% CI                                                           (-0.02941,0.82941)
Bennett_S                                                        0.1
Chi-Squared                                                      6.66667
Chi-Squared DF                                                   4
Conditional Entropy                                              0.55098
Cramer_V                                                         0.8165
Cross Entropy                                                    1.52193
Gwet_AC1                                                         0.13043
Joint Entropy                                                    1.92193
KL Divergence                                                    0.15098
Kappa                                                            0.0625
Kappa 95% CI                                                     (-0.60846,0.73346)
Kappa No Prevalence                                              -0.2
Kappa Standard Error                                             0.34233
Kappa Unbiased                                                   0.03226
Lambda A                                                         0.5
Lambda B                                                         0.66667
Mutual Information                                               0.97095
Overall_ACC                                                      0.4
Overall_RACC                                                     0.36
Overall_RACCU                                                    0.38
PPV_Macro                                                        0.5
PPV_Micro                                                        0.4
Phi-Squared                                                      1.33333
Reference Entropy                                                1.37095
Response Entropy                                                 1.52193
Scott_PI                                                         0.03226
Standard Error                                                   0.21909
Strength_Of_Agreement(Altman)                                    Poor
Strength_Of_Agreement(Cicchetti)                                 Poor
Strength_Of_Agreement(Fleiss)                                    Poor
Strength_Of_Agreement(Landis and Koch)                           Slight
TPR_Macro                                                        0.44444
TPR_Micro                                                        0.4

Class Statistics :

Classes                                                          0                       1                       2                       
ACC(Accuracy)                                                    1.0                     0.4                     0.4                     
BM(Informedness or bookmaker informedness)                       1.0                     -0.16667                -0.5                    
DOR(Diagnostic odds ratio)                                       None                    0.5                     0.0                     
ERR(Error rate)                                                  0.0                     0.6                     0.6                     
F0.5(F0.5 score)                                                 1.0                     0.45455                 0.0                     
F1(F1 score - harmonic mean of precision and sensitivity)        1.0                     0.4                     0.0                     
F2(F2 score)                                                     1.0                     0.35714                 0.0                     
FDR(False discovery rate)                                        0.0                     0.5                     1.0                     
FN(False negative/miss/type 2 error)                             0                       2                       1                       
FNR(Miss rate or false negative rate)                            0.0                     0.66667                 1.0                     
FOR(False omission rate)                                         0.0                     0.66667                 0.33333                 
FP(False positive/type 1 error/false alarm)                      0                       1                       2                       
FPR(Fall-out or false positive rate)                             0.0                     0.5                     0.5                     
G(G-measure geometric mean of precision and sensitivity)         1.0                     0.40825                 0.0                     
LR+(Positive likelihood ratio)                                   None                    0.66667                 0.0                     
LR-(Negative likelihood ratio)                                   0.0                     1.33333                 2.0                     
MCC(Matthews correlation coefficient)                            1.0                     -0.16667                -0.40825                
MK(Markedness)                                                   1.0                     -0.16667                -0.33333                
N(Condition negative)                                            4                       2                       4                       
NPV(Negative predictive value)                                   1.0                     0.33333                 0.66667                 
P(Condition positive)                                            1                       3                       1                       
POP(Population)                                                  5                       5                       5                       
PPV(Precision or positive predictive value)                      1.0                     0.5                     0.0                     
PRE(Prevalence)                                                  0.2                     0.6                     0.2                     
RACC(Random accuracy)                                            0.04                    0.24                    0.08                    
RACCU(Random accuracy unbiased)                                  0.04                    0.25                    0.09                    
TN(True negative/correct rejection)                              4                       1                       2                       
TNR(Specificity or true negative rate)                           1.0                     0.5                     0.5                     
TON(Test outcome negative)                                       4                       3                       3                       
TOP(Test outcome positive)                                       1                       2                       2                       
TP(True positive/hit)                                            1                       1                       0                       
TPR(Sensitivity, recall, hit rate, or true positive rate)        1.0                     0.33333                 0.0
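
As a cross-check of the Kappa value in the report above, here is a minimal sketch that computes Cohen's kappa by hand from the same confusion matrix, without PyCM. The function name `cohen_kappa` is an assumption for illustration; the matrix is the one passed to ConfusionMatrix earlier (outer keys are actual classes, inner keys are predicted classes).

```python
def cohen_kappa(matrix):
    """Cohen's kappa from a nested dict {actual: {predicted: count}}."""
    classes = list(matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[c][c] for c in classes) / total
    # Chance agreement: for each class, (actual count) * (predicted count).
    expected = sum(
        sum(matrix[c].values()) * sum(matrix[r][c] for r in classes)
        for c in classes
    ) / total ** 2
    return (observed - expected) / (1 - expected)


matrix = {
    "0": {"0": 1, "1": 0, "2": 0},
    "1": {"0": 0, "1": 1, "2": 2},
    "2": {"0": 0, "1": 1, "2": 0},
}
kappa = cohen_kappa(matrix)
print(round(kappa, 5))
```

Here the observed agreement is 0.4 (Overall_ACC in the report) and the chance agreement is 0.36 (Overall_RACC), so kappa is (0.4 - 0.36) / (1 - 0.36) = 0.0625, matching the Kappa row.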
