Data mining - Computing class weights for multi-label data - unhashable type

I am working on a multi-label classification problem with 13 possible outputs in my neural network, using Keras, sklearn, etc.

Each output can be an array such as [0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0].

My dataset is imbalanced, so I tried to apply the compute_class_weight method, like this:

class_weight = compute_class_weight('balanced', np.unique(Y_train), Y_train)

When I run my code, I get TypeError: unhashable type: 'numpy.ndarray':

Traceback (most recent call last):
  File "main.py", line 115, in <module>
    train(dataset, labels)
  File "main.py", line 66, in train
    class_weight = compute_class_weight('balanced', np.unique(Y_train), Y_train)
  File "/home/python-env/env/lib/python3.6/site-packages/sklearn/utils/class_weight.py", line 41, in compute_class_weight
    if set(y) - set(classes):
TypeError: unhashable type: 'numpy.ndarray'
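
The failing `set(y)` call can be reproduced in isolation. This is a minimal sketch, assuming Y_train is a 2-D NumPy array with one row per sample and one column per label; the toy matrix and the helper name are made up for illustration. Iterating over a 2-D array yields 1-D arrays, which `set()` cannot hash, whereas a single column yields hashable scalars.

```python
import numpy as np

# Hypothetical toy label matrix: 3 samples, 4 labels (not the real data).
Y_toy = np.array([[0, 1, 0, 1],
                  [1, 0, 0, 1],
                  [0, 0, 0, 1]])

def rows_are_hashable(y):
    """Return True if set(y) succeeds, i.e. every element of y is hashable."""
    try:
        set(y)
        return True
    except TypeError:
        return False

print(rows_are_hashable(Y_toy))        # 2-D input: each element is an ndarray row
print(rows_are_hashable(Y_toy[:, 0]))  # 1-D input: each element is a NumPy scalar
```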

I know this happens because I am working with arrays. I have already tried building a dictionary, for example:

class_weight_dict = dict(enumerate(np.unique(y_train), class_weight))
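
As a side note, the second argument of `enumerate` is an integer start index, not a second sequence, so the call above cannot pair classes with weights. Once a 1-D array of weights exists, `zip` is the usual way to build such a mapping. A small sketch with made-up values:

```python
import numpy as np

# Hypothetical values for illustration only.
classes = np.array([0, 1])
weights = np.array([0.75, 1.5])

# Pair each class with its weight, converting NumPy scalars to plain Python types.
class_weight_dict = {int(c): float(w) for c, w in zip(classes, weights)}
print(class_weight_dict)
```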

Well, I don't know what to do next. I tried other strategies without success... Any ideas?
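
One possible direction, sketched under the assumption that Y_train is a 2-D 0/1 matrix (samples by labels): treat each of the 13 columns as its own binary problem and apply the "balanced" heuristic documented for scikit-learn, n_samples / (n_classes * count), column by column. The helper name and toy data below are hypothetical, the computation uses NumPy only, and how the result is fed into Keras (for example through a weighted loss) is left open.

```python
import numpy as np

def per_label_balanced_weights(Y):
    """Return an (n_labels, 2) array: column 0 holds the weight for value 0,
    column 1 the weight for value 1, computed as n_samples / (2 * count)."""
    Y = np.asarray(Y)
    n_samples = Y.shape[0]
    pos = Y.sum(axis=0)        # number of 1s per label
    neg = n_samples - pos      # number of 0s per label
    counts = np.stack([neg, pos], axis=1).astype(float)
    # Assumption: a value that never occurs for a label gets weight 0
    # instead of triggering a division by zero.
    weights = np.zeros_like(counts)
    np.divide(n_samples, 2.0 * counts, out=weights, where=counts > 0)
    return weights

# Hypothetical toy data: 4 samples, 3 labels.
Y_toy = np.array([[0, 1, 0],
                  [1, 1, 0],
                  [0, 1, 0],
                  [0, 0, 0]])
print(per_label_balanced_weights(Y_toy))
```

Rare values receive larger weights: in the toy data the first label is positive in one of four samples, so its positive weight is 2.0 and its negative weight is about 0.67.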

Thanks in advance!