Computing the probability of LibLinear classification results

Cross Validated · svm · linear-model · libsvm

2022-04-13 08:54:48

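I am using LibLinear for a document classification task, and I would like to compute the probability that each prediction is correct. LibLinear does provide probability outputs for logistic regression, but not for its default support vector classification. Moreover, based on 10-fold cross-validation, logistic regression performs nearly 10% worse than support vector classification.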

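So can anyone tell me: if I stay with the support vector classification solution, is there a way to compute the probabilities independently of the program?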

2 Answers

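You can use a sigmoid function to convert the SVM decision value $d = \langle w, x \rangle + b$ into a number between 0 and 1, which can be regarded as a probability: $f(d) = \frac{1}{1 + e^{-\alpha(d-\beta)}}$. You can tune the parameters $\alpha$ and $\beta$ according to your data.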
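As a rough illustration of the sigmoid mapping $f(d)$, here is a minimal Python sketch that uses only the standard library. It assumes you have already extracted decision values from your trained linear SVM on a held-out set, together with 0/1 labels. The function names (`decision_value`, `sigmoid`, `fit_sigmoid`), the toy data, and the choice of plain gradient descent on the log loss are all assumptions made for this example; none of them are part of LibLinear, and other fitting procedures would work too.

```python
import math

def decision_value(w, x, b=0.0):
    """Linear SVM decision value d = <w, x> + b for dense vectors."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sigmoid(d, alpha, beta):
    """f(d) = 1 / (1 + exp(-alpha * (d - beta))), evaluated without overflow."""
    z = alpha * (d - beta)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_sigmoid(decisions, labels, lr=0.1, n_iter=5000):
    """Tune alpha and beta by gradient descent on the mean log loss.

    decisions: decision values d from the SVM (ideally on held-out data)
    labels:    1 for the positive class, 0 for the negative class
    lr, n_iter are illustrative defaults, not recommended settings.
    """
    alpha, beta = 1.0, 0.0
    n = len(decisions)
    for _ in range(n_iter):
        g_alpha = 0.0
        g_beta = 0.0
        for d, y in zip(decisions, labels):
            err = sigmoid(d, alpha, beta) - y  # derivative of log loss w.r.t. z
            g_alpha += err * (d - beta)        # dz/dalpha = d - beta
            g_beta += err * (-alpha)           # dz/dbeta  = -alpha
        alpha -= lr * g_alpha / n
        beta -= lr * g_beta / n
    return alpha, beta

# Toy held-out decision values (hypothetical); the two classes overlap
# slightly, so the fitted alpha stays finite.
decisions = [-2.0, -1.5, -1.0, -0.5, 0.3, -0.2, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

alpha, beta = fit_sigmoid(decisions, labels)

# Probability-like score for a new example with hypothetical weights.
d_new = decision_value([0.5, -1.0], [2.0, 1.0], b=0.25)
p_new = sigmoid(d_new, alpha, beta)
```

Fitting $\alpha$ and $\beta$ on data the SVM was not trained on is a design choice here: decision values on the training set tend to be more extreme than on unseen data, so a sigmoid tuned on them may come out overconfident.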

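For more elaborate methods, see these papers:

B. Zadrozny and C. Elkan, Transforming classifier scores into accurate multiclass probability estimates.
J. Drish, Obtaining calibrated probability estimates from support vector machines.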

At least in R, only two of the algorithms exposed through the LiblineaR interface provide probabilities.

Here is the FAQ from the actual library:

Q: How do I choose the solver? Should I use logistic regression or linear SVM? How about L1/L2 regularization?
Generally we recommend linear SVM as its training is faster and the accuracy is competitive. However, if you would like to have probability outputs, you may consider logistic regression.

Moreover, try L2 regularization first unless you need a sparse model. For most cases, L1 regularization does not give higher accuracy but may be slightly slower in training.

Among L2-regularized SVM solvers, try the default one (L2-loss SVC dual) first. If it is too slow, use the option -s 2 to solve the primal problem.

It appears that the SVM algorithms do not provide probabilities as output.
