In Aurelien Geron's book, I came across this passage:
"This cost function makes sense because −log(t) grows very large when t approaches 0, so the cost will be large if the model estimates a probability close to 0 for a positive instance, and it will also be very large if the model estimates a probability close to 1 for a negative instance. On the other hand, −log(t) is close to 0 when t is close to 1, so the cost will be close to 0 if the estimated probability is close to 0 for a negative instance or close to 1 for a positive instance, which is precisely what we want."
What I don't understand is: why is the cost large when the model estimates a probability close to 0 for a positive instance, and why is it also large when the model estimates a probability close to 1 for a negative instance?
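For context, if I'm reading the chapter correctly, the cost function the passage refers to is the per-instance log loss: −log(p̂) when the instance is positive (y = 1) and −log(1 − p̂) when it is negative (y = 0). Here is a minimal sketch of the behaviour the quote describes, assuming that definition (the function name `instance_cost` is mine, not from the book):

```python
import numpy as np

def instance_cost(p_hat: float, y: int) -> float:
    """Per-instance log loss: -log(p_hat) if y == 1, else -log(1 - p_hat)."""
    return -np.log(p_hat) if y == 1 else -np.log(1 - p_hat)

# Probe the cost at both extremes and in the middle.
for p_hat in (0.001, 0.5, 0.999):
    print(f"p_hat={p_hat:>5}: "
          f"cost if positive = {instance_cost(p_hat, 1):7.3f}, "
          f"cost if negative = {instance_cost(p_hat, 0):7.3f}")
```

Running this shows the two cases the quote mentions: at p̂ = 0.001 the cost for a positive instance is about 6.9 (large, since −log(t) blows up as t → 0), while at p̂ = 0.999 the cost for a negative instance is about 6.9 for the same reason, because there the relevant quantity is 1 − p̂ ≈ 0.001.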