Keras categorical_crossentropy loss (and accuracy)

data-mining keras categorical-data reference-request loss-function
2021-10-05 12:30:58

When training a neural network in Keras with the categorical_crossentropy loss, how exactly is the loss defined? I expect it to be the average over all samples of

$$\mathrm{loss}(p^{\mathrm{true}}, p^{\mathrm{predict}}) = -\sum_i p_i^{\mathrm{true}} \log p_i^{\mathrm{predict}}$$
but I cannot find an explicit answer in either the documentation or the code. An authoritative reference would be appreciated.
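For example, for a single sample with one-hot truth $p^{\mathrm{true}} = (0, 1, 0)$ and prediction $p^{\mathrm{predict}} = (0.2, 0.7, 0.1)$, this would give $-\log 0.7 \approx 0.357$, and I expect Keras to average these per-sample values over the batch.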

Looking at the code, I am not sure whether the computation is delegated to TensorFlow / Theano.

(There is a similar question regarding accuracy; that code is clearer, but I don't see a call to mean() anywhere?)
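For reference, the categorical_accuracy metric in Keras 2.x looks essentially like the sketch below: it returns one 0/1 value per sample and contains no explicit mean(), so the averaging presumably happens later, in the training loop.

from keras import backend as K

def categorical_accuracy(y_true, y_pred):
    # 1.0 where the predicted argmax matches the one-hot target's argmax,
    # 0.0 otherwise; one value per sample, the mean is taken elsewhere
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())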

PS. Judging from the code below, the loss and accuracy recorded in the history match the values computed by loss_and_acc(...) just before the corresponding training epoch, presumably because with only 10 samples each epoch is a single batch, so the recorded loss is evaluated before that batch's weight update (Keras version 2.0.4; the TensorFlow and Theano backends give the same results).

#!/usr/bin/python3

import numpy as np
from numpy.random import randint, seed

from keras import __version__ as keras_version
from keras.models import Sequential
from keras.layers import Dense

N = 4 # Classes
S = 10 # Samples

nn = Sequential()
nn.add(Dense(input_dim=1, units=N, kernel_initializer='normal', activation='softmax'))
nn.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

seed(7)
X = np.random.random((S, 1))
# One-hot targets: each row of Y has a single 1 in a random position
Y = np.vstack([np.eye(1, N, k=randint(0, N)) for _ in range(S)])
#for (x, y) in zip(X, Y) : print(x, y)

def loss_and_acc(NN, X, Y) :
    # Mean categorical cross-entropy and top-1 accuracy over all samples
    loss = []
    acc  = []
    for (p, q) in zip(Y, NN.predict(X)) :
        # per-sample loss: -sum_i p_i * log(q_i), skipping q_i == 0 to avoid log(0)
        loss += [ -sum(a*np.log(b) for (a, b) in zip(p, q) if (b != 0)) ]
        acc  += [ np.argmax(p) == np.argmax(q) ]
    return (np.mean(loss), np.mean(acc))

print("Keras version: ", keras_version)

for _ in range(10) :
    print("Before:  loss = {}, acc = {}".format(*loss_and_acc(nn, X, Y)))
    H = nn.fit(X, Y, epochs=1, verbose=0).history
    print("History: loss = {}, acc = {}".format(H['loss'][-1], H['acc'][-1]))

Output:

Using Theano backend.
Keras version:  2.0.4
Before:  loss = 1.3843669414520263, acc = 0.2
History: loss = 1.3843669891357422, acc = 0.20000000298023224
Before:  loss = 1.3834303855895995, acc = 0.2
History: loss = 1.3834303617477417, acc = 0.20000000298023224
Before:  loss = 1.3824962615966796, acc = 0.3
History: loss = 1.3824962377548218, acc = 0.30000001192092896
Before:  loss = 1.381564486026764, acc = 0.3
History: loss = 1.3815644979476929, acc = 0.30000001192092896
Before:  loss = 1.380635154247284, acc = 0.3
History: loss = 1.380635142326355, acc = 0.30000001192092896
Before:  loss = 1.3797082901000977, acc = 0.3
History: loss = 1.3797082901000977, acc = 0.30000001192092896
Before:  loss = 1.378783941268921, acc = 0.2
History: loss = 1.378783941268921, acc = 0.20000000298023224
Before:  loss = 1.3778621554374695, acc = 0.2
History: loss = 1.3778622150421143, acc = 0.20000000298023224
Before:  loss = 1.3769428968429565, acc = 0.2
History: loss = 1.3769428730010986, acc = 0.20000000298023224
Before:  loss = 1.3760262489318849, acc = 0.3
History: loss = 1.3760262727737427, acc = 0.30000001192092896
1 Answer

I am using Keras with the TensorFlow backend. I checked, and the categorical_crossentropy loss in Keras is defined exactly as you wrote. Here is the relevant part of the code (not the whole function definition):

def categorical_crossentropy(target, output, from_logits=False, axis=-1):
    if not from_logits:
        # scale preds so that the class probas of each sample sum to 1
        output /= tf.reduce_sum(output, axis, True)
        # manual computation of crossentropy
        _epsilon = _to_tensor(epsilon(), output.dtype.base_dtype)
        output = tf.clip_by_value(output, _epsilon, 1. - _epsilon)
    return - tf.reduce_sum(target * tf.log(output), axis) 

As you can see from the last line, it returns, for each observation, the negative sum over classes of the true values times the log of the predicted values, i.e. one loss value per observation. As far as I can tell, the mean over samples is then taken later by Keras's training machinery. You can find the full function definition in the TensorFlow backend source, at line 3176.

For the Theano backend it should be the same; you can check the definition there, at line 1622.
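As a quick sanity check (a minimal sketch; the toy y_true / y_pred values below are made up), you can call the loss function directly and confirm that it returns one value per sample, with the scalar that Keras reports being the mean of those values:

from keras import backend as K
from keras.losses import categorical_crossentropy

# Toy batch: two samples, three classes (illustrative values only)
y_true = K.constant([[1., 0., 0.], [0., 1., 0.]])
y_pred = K.constant([[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]])

per_sample = categorical_crossentropy(y_true, y_pred)
print(K.eval(per_sample))          # one loss per sample: ~[0.105, 0.223]
print(K.eval(K.mean(per_sample)))  # the mean over samples: ~0.164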