Cross Validated - Are the different definitions of the cross-entropy loss function not equivalent? - 吾爱随笔录

The question "Different definitions of the cross-entropy loss function" presents two different definitions of the cross-entropy loss:

$C = -\frac{1}{n} \sum_x \sum_j (y_j \ln a_{j}^{L})$ and

$C = -\frac{1}{n} \sum_x \sum_j (y_j \ln a_{j}^{L} + (1-y_j) \ln(1-a_{j}^{L})).$

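The analysis in the answer to that question shows that, for binary classification (two output neurons, $j = 1, 2$), assuming $\sum_j a_j = 1$ and that $y$ is a one-hot vector, the following holds: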

$C = -\frac{1}{n} \sum_x \sum_{j=1}^2 (y_j \ln a_j) = -\frac{1}{n} \sum_x [y_1 \ln a_1 + y_2 \ln a_2] = -\frac{1}{n} \sum_x [y_1 \ln a_1 + (1 - y_1) \ln (1 - a_1)].$

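However, I do not see how this analysis shows that the two definitions are equivalent under these assumptions, because if we let $j$ run over the two output neurons ($j = 1, 2$) in the second definition, it gives: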

$C = -\frac{1}{n} \sum_x [y_1 \ln a_1^L + (1 - y_1) \ln (1 - a_1^L) + y_2 \ln a_2^L + (1 - y_2) \ln (1 - a_2^L)].$
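To make the mismatch concrete, here is a small numerical check for a single sample with two output neurons. It is only a sketch under the stated assumptions ($a_1 + a_2 = 1$, $y$ one-hot); the function names `ce_single_term` and `ce_two_term` and the example values are hypothetical, chosen for illustration. Substituting $y_2 = 1 - y_1$ and $a_2 = 1 - a_1$ into the expanded sum makes each term appear twice, so in this case the second definition should come out as twice the first rather than equal to it.

```python
import math

def ce_single_term(y, a):
    """First definition for one sample: -sum_j y_j ln a_j."""
    return -sum(yj * math.log(aj) for yj, aj in zip(y, a))

def ce_two_term(y, a):
    """Second definition for one sample:
    -sum_j [y_j ln a_j + (1 - y_j) ln(1 - a_j)]."""
    return -sum(
        yj * math.log(aj) + (1 - yj) * math.log(1 - aj)
        for yj, aj in zip(y, a)
    )

# Illustrative values (assumed): one-hot target, activations summing to 1.
y = [1.0, 0.0]
a = [0.8, 0.2]

c1 = ce_single_term(y, a)
c2 = ce_two_term(y, a)
print(c1, c2, c2 / c1)  # the ratio should be very close to 2
```

A constant factor of 2 does not change where the minimum lies, but it does mean the two expressions are not the same function when both are summed over $j = 1, 2$.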

Furthermore, I would like to prove that these definitions are equivalent for an arbitrary number of output neurons, not just 2.
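Before attempting a proof for more than two neurons, it may help to compare the two expressions numerically. The sketch below assumes a softmax output layer (so that $\sum_j a_j = 1$) and a one-hot target; the helper names `softmax`, `loss_first`, and `loss_second` and the input values are hypothetical. For a one-hot $y$ with true class $c$, the second definition reduces to the first plus the extra term $-\sum_{j \neq c} \ln(1 - a_j)$, which is what the code checks on one example. This is a single data point, not a proof.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def loss_first(y, a):
    """First definition for one sample: -sum_j y_j ln a_j."""
    return -sum(yj * math.log(aj) for yj, aj in zip(y, a))

def loss_second(y, a):
    """Second definition for one sample:
    -sum_j [y_j ln a_j + (1 - y_j) ln(1 - a_j)]."""
    return -sum(
        yj * math.log(aj) + (1 - yj) * math.log(1 - aj)
        for yj, aj in zip(y, a)
    )

# Illustrative logits for four output neurons (assumed values).
a = softmax([2.0, 0.5, -1.0, 0.1])
y = [0, 1, 0, 0]  # one-hot target, true class is index 1

first = loss_first(y, a)
second = loss_second(y, a)

# Extra contribution from the neurons whose target is 0.
extra = -sum(math.log(1 - aj) for yj, aj in zip(y, a) if yj == 0)

print(first, second, extra)
```

In this example `second` equals `first + extra`, and `extra` is strictly positive, so the two values differ by an amount that depends on the activations of the non-target neurons.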