sklearn "balanced_accuracy_score" sample_weights not working

data-mining scikit-learn accuracy metric
2022-02-23 14:20:57

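I want a metric where I can weight each class as I see fit when measuring "overall accuracy". sklearn seems to offer this with balanced_accuracy_score, but I get the same "balanced accuracy" regardless of the sample_weight I pass. Why? What is the point of sample_weights?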

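import numpy as np
from sklearn.metrics import balanced_accuracy_score

# y: true labels, m: fitted model, xs: features (all defined elsewhere)
# weight 1 for class 0, weight 1000 for every other class
sample_weight = np.array([1 if i == 0 else 1000 for i in y])
balanced_accuracy_score(y, m.predict(xs), sample_weight=sample_weight)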

Here is the documentation.

1 Answer

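The point of sample_weights is to give weight to specific samples (for example, by their importance or certainty), not to specific classes.
Balanced accuracy, according to the user guide, is: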

the macro-average of recall scores per class

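Because the score is averaged across classes, only the weights within a class matter, not the weights between classes. Your weights are identical within each class and differ only between classes, so they have no effect.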
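As a quick check of the definition above, the following sketch compares balanced_accuracy_score with the macro-averaged recall, and then passes weights that are constant within each class. The labels are the toy data from the answer's own example; the variable names are only illustrative.

```python
from sklearn.metrics import balanced_accuracy_score, recall_score

y_true = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1]

# Weights that differ between classes but are constant within each class
weights_by_class = [1 if y == 1 else 1000 for y in y_true]

# Recall of each class, and the unweighted mean of those recalls
per_class_recall = recall_score(y_true, y_pred, average=None)
macro_recall = recall_score(y_true, y_pred, average="macro")

bal_acc = balanced_accuracy_score(y_true, y_pred)
bal_acc_weighted = balanced_accuracy_score(
    y_true, y_pred, sample_weight=weights_by_class
)

print(per_class_recall)   # recall of class 0 and of class 1
print(macro_recall, bal_acc, bal_acc_weighted)
```

All three printed scores should agree: the class-constant weights cancel out inside each class's recall before the recalls are averaged.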

Explicitly (again from the user guide):

$$\hat{w}_i = \frac{w_i}{\sum_j 1(y_j = y_i)\, w_j}$$

That is, the i-th sample is re-weighted by dividing its weight by the total weight of the samples that share its label.
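The re-weighting can be reproduced by hand. This is a minimal sketch, assuming NumPy arrays and the toy labels and weights from the answer's example; `w_hat`, `class_totals` and `manual` are illustrative names, not part of scikit-learn.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score

y_true = np.array([0, 1, 0, 0, 1, 0, 1, 1, 1, 1])
y_pred = np.array([0, 1, 0, 0, 0, 1, 1, 1, 1, 1])
w = np.array([10, 1, 1, 1, 10, 1, 0.5, 0.5, 0.5, 0.5])

# For each sample, the total weight of all samples sharing its label
class_totals = np.array([w[y_true == label].sum() for label in y_true])
w_hat = w / class_totals  # normalized weights sum to 1 within each class

# Summing w_hat over correct predictions gives the sum of per-class recalls;
# dividing by the number of classes gives their mean
n_classes = len(np.unique(y_true))
manual = np.sum(w_hat * (y_pred == y_true)) / n_classes

print(f"{manual:.2f}")  # 0.58
print(f"{balanced_accuracy_score(y_true, y_pred, sample_weight=w):.2f}")  # 0.58

# Class-constant weights normalize to the same values as uniform weights
w_const = np.where(y_true == 0, 1000.0, 1.0)
const_totals = np.array([w_const[y_true == label].sum() for label in y_true])
uniform_totals = np.array([(y_true == label).sum() for label in y_true])
print(np.allclose(w_const / const_totals, 1.0 / uniform_totals))  # True
```

The last check shows why weights that only vary between classes vanish: after normalization every sample in a class carries 1 / (class size), exactly as in the unweighted case.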

If you want class-level weighting, you can use the plain accuracy score instead and plug in whatever weights you need.

In the following example:

from sklearn.metrics import balanced_accuracy_score, accuracy_score

y_true = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1]

some_sample_weights =[10, 1, 1, 1, 10, 1, 0.5, 0.5, 0.5, 0.5]
weights_by_class =[1 if y==1 else 1000 for y in y_true]

print('with some weights: {:.2f}'.format(balanced_accuracy_score(y_true, y_pred, sample_weight=some_sample_weights)))
print('without weights: {:.2f}'.format(balanced_accuracy_score(y_true, y_pred)))
print('with class weights in balanced accuracy score: {:.2f}'.format(balanced_accuracy_score(y_true, y_pred, sample_weight=weights_by_class)))
print('with class weights in accuracy score: {:.5f}'.format(accuracy_score(y_true, y_pred, sample_weight=weights_by_class)))

class_sizes = [sum((1 for y in y_true if y==x))/len(y_true) for x in (0,1)] 
weights_by_class_manually_balanced = [w/class_sizes[y] for w, y in zip(weights_by_class, y_true)]

print('with class weights in accuracy score (manually balanced): {:.5f}'.format(accuracy_score(y_true, y_pred, sample_weight=weights_by_class_manually_balanced)))

you get:

with some weights: 0.58
without weights: 0.79
with class weights in balanced accuracy score: 0.79
with class weights in accuracy score: 0.75012
with class weights in accuracy score (manually balanced): 0.75008

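As you can see:

  • Using class weights in the balanced accuracy score makes no difference; they are simply normalized away within each class.
  • Using class weights in the accuracy score gives a result very close to 75% (3 of the 4 samples labeled 0 are classified correctly), and rescaling the weights by class size makes little difference (the accuracy is slightly lower because class 0 is the smaller class, so its weights are scaled up more and pull the score closer to 75%).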
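The "manually balanced" variant above is equivalent to a weighted average of the per-class recalls, which may be the most direct way to express a class-weighted score. The sketch below assumes that reading; `class_weights` is a hypothetical choice that mirrors the 1000-to-1 weighting in the example, not a scikit-learn parameter.

```python
import numpy as np
from sklearn.metrics import recall_score

y_true = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1]

# Hypothetical importance of class 0 and class 1, in the order of `labels`
class_weights = [1000, 1]

# Recall of each class, in the same order as `labels`
recalls = recall_score(y_true, y_pred, labels=[0, 1], average=None)

# Weighted mean of the per-class recalls
score = np.average(recalls, weights=class_weights)
print(f"{score:.5f}")  # 0.75008
```

With equal class weights this reduces to the ordinary balanced accuracy.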