Mean error (not squared) in scikit-learn cross_val_score

data-mining scikit-learn cross-validation
2022-03-07 13:49:18

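I need to know whether the values produced in each fold of cross_val_score have a distribution centered on zero. Something as simple as the median or mean of y_true - y_predicted would be enough. All I see among the available scoring options are absolute and squared errors. I have looked into make_scorer but can't see how to code a simple mean error and then use it in cross_val_score.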

1 Answer
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.metrics import make_scorer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def mean_error(y, y_pred):
    # assuming y and y_pred are numpy arrays
    return np.mean(y_pred - y)

X, y = load_diabetes(return_X_y=True)
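# greater_is_better=False makes the scorer negate the value, so each fold
# reports -mean(y_pred - y), i.e. mean(y - y_pred).
mean_error_scorer = make_scorer(mean_error, greater_is_better=False)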

regr = LinearRegression()
# One signed mean error per fold (5 folds by default).
scores = cross_val_score(regr, X, y, scoring=mean_error_scorer)
print(scores)
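
Since the question also mentions the median, here is a minimal sketch of one way to get both the mean and the median of y_true - y_pred per fold in a single run. It swaps cross_val_score for cross_validate, which accepts a dict of scorers. The helper names mean_residual and median_residual are made up for illustration, and cv=5 is just an assumed fold count.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_validate


def mean_residual(y_true, y_pred):
    # Signed mean of y_true - y_pred (hypothetical helper name).
    return np.mean(np.asarray(y_true) - np.asarray(y_pred))


def median_residual(y_true, y_pred):
    # Signed median of y_true - y_pred (hypothetical helper name).
    return np.median(np.asarray(y_true) - np.asarray(y_pred))


X, y = load_diabetes(return_X_y=True)

# greater_is_better defaults to True, so the values come back without a sign flip.
scoring = {
    "mean_residual": make_scorer(mean_residual),
    "median_residual": make_scorer(median_residual),
}

# cv=5 is an assumption; results are stored under "test_<scorer name>".
results = cross_validate(LinearRegression(), X, y, cv=5, scoring=scoring)
mean_per_fold = results["test_mean_residual"]
median_per_fold = results["test_median_residual"]

print("mean residual per fold:  ", mean_per_fold)
print("median residual per fold:", median_per_fold)
```

Values scattered around zero across folds would suggest the predictions are not systematically too high or too low. Leaving greater_is_better at its default keeps the sign as written in the function, which is easier to read when the metric is only being inspected and not used for model selection.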