Data mining - Imbalanced binary dataset in Keras: finding the best threshold after fitting such that sensitivity and specificity are maximized?

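I built an ANN in Keras for an imbalanced binary dataset. After fitting the model, I use it to predict the binary class, and I want to choose a threshold such that sensitivity and specificity are maximized.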

This is the code I am using right now. It iterates over all thresholds from 0 to 1 and uses the G-mean score to find the best one.

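import math
from numpy import arange

# Keras returns shape (n, 1); flatten to one probability per sample
predictions = model_p.predict(Xt).ravel()
thresholds = arange(0, 1, 0.001)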
threshold = -1
best_Gscore = 0
false_positive = 0
true_positive = 0
false_negative = 0
true_negative = 0

for z in thresholds:
    print("Threshold => %f" % z)
    # Confusion-matrix counts at threshold z
    fp = 0
    fn = 0
    tp = 0
    tn = 0
    for i in range(len(yt)):
        if yt[i] == 0 and predictions[i] > z:
            fp += 1
        elif yt[i] == 1 and predictions[i] > z:
            tp += 1
        elif yt[i] == 1 and predictions[i] <= z:
            fn += 1
        elif yt[i] == 0 and predictions[i] <= z:
            tn += 1

    # Skip if either rate would be undefined (a class is missing)
    if (tp + fn) == 0:
        continue
    if (tn + fp) == 0:
        continue
    TPR = tp / (tp + fn)  # sensitivity
    FPR = fp / (fp + tn)  # 1 - specificity
    Gscore = math.sqrt(TPR * (1 - FPR))  # sqrt(sens * spec)

    print("G-mean => %f" % Gscore, flush=True)

    # Keep the threshold with the highest G-mean seen so far
    if Gscore > best_Gscore:
        best_Gscore = Gscore
        false_positive = fp
        false_negative = fn
        true_positive = tp
        true_negative = tn
        threshold = z
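For comparison, here is a minimal sketch of the same G-mean search done in one sorted pass instead of re-counting the confusion matrix for each of the 1000 grid values. It only tries thresholds at observed scores, since the counts cannot change anywhere else. The function name `best_gmean_threshold` is made up for illustration, and it assumes `scores` is a flat sequence of floats (for example `predictions.ravel()`) with both classes present in `y_true`.

```python
import math

def best_gmean_threshold(y_true, scores):
    """Return (threshold, g_mean) maximizing sqrt(sensitivity * specificity).

    A sample counts as predicted positive when score > threshold,
    the same rule as in the loop above.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one sample of each class")

    # Visit samples from lowest to highest score. Once the threshold
    # reaches score s, every sample with score <= s is predicted negative.
    pairs = sorted(zip(scores, y_true))
    tp, tn = n_pos, 0  # threshold below all scores: everything is positive
    best_thr, best_g = None, -1.0
    i = 0
    while i < len(pairs):
        s = pairs[i][0]
        # Consume every sample tied at score s before evaluating
        while i < len(pairs) and pairs[i][0] == s:
            if pairs[i][1] == 1:
                tp -= 1  # a positive falls below the threshold
            else:
                tn += 1  # a negative is now correctly rejected
            i += 1
        g = math.sqrt((tp / n_pos) * (tn / n_neg))
        if g > best_g:
            best_thr, best_g = s, g
    return best_thr, best_g

# Toy example with perfectly separable scores
thr, g = best_gmean_threshold([0, 0, 1, 1], [0.1, 0.2, 0.7, 0.9])
print(thr, g)
```

Ties in G-mean are resolved in favor of the lowest threshold here; that is an arbitrary choice of this sketch, not something the question specifies.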

But is there a better way to maximize sens and spec? Perhaps find a sens and spec such that, for example,

| sens - spec | < 0.05 and sens*spec > score_max

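Then, once this score_max is found, you could search with smaller steps on either side of it, e.g. +- 0.2? Or is there another way to find the maximum sensitivity and specificity?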
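To make the idea concrete, this is roughly what I have in mind: a coarse grid search under the `| sens - spec | < 0.05` constraint, followed by a finer search around the best coarse threshold. All function names, the grid sizes, and the width of the refinement window are placeholders I picked for illustration, and the sketch assumes both classes are present and `scores` is a flat sequence of floats.

```python
def sens_spec(y_true, scores, thr):
    """Sensitivity and specificity when predicting positive for score > thr."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        if y == 1:
            if s > thr:
                tp += 1
            else:
                fn += 1
        else:
            if s > thr:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)

def search(y_true, scores, lo, hi, steps, max_gap=0.05):
    """Best threshold in [lo, hi] by sens*spec, subject to |sens - spec| < max_gap."""
    best_thr, score_max = None, -1.0
    for k in range(steps + 1):
        thr = lo + (hi - lo) * k / steps
        sens, spec = sens_spec(y_true, scores, thr)
        if abs(sens - spec) < max_gap and sens * spec > score_max:
            best_thr, score_max = thr, sens * spec
    return best_thr, score_max

def coarse_to_fine(y_true, scores, coarse_steps=20, fine_steps=100, max_gap=0.05):
    """Coarse pass over [0, 1], then a finer pass around the coarse optimum."""
    thr, score = search(y_true, scores, 0.0, 1.0, coarse_steps, max_gap)
    if thr is None:
        return None, None  # no threshold satisfied the balance constraint
    half = 1.0 / coarse_steps  # refine within one coarse step on each side
    fine_thr, fine_score = search(
        y_true, scores, max(0.0, thr - half), min(1.0, thr + half),
        fine_steps, max_gap,
    )
    # Keep the coarse result unless the fine pass strictly improves on it
    if fine_thr is not None and fine_score > score:
        return fine_thr, fine_score
    return thr, score

y = [0, 0, 0, 1, 1, 1]
p = [0.1, 0.2, 0.33, 0.62, 0.7, 0.9]
print(coarse_to_fine(y, p))
```

With a tight `max_gap` on a small or heavily imbalanced test set, it is possible that no threshold satisfies the constraint, in which case this sketch returns `(None, None)`.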