Prediction problem with an xgboost custom loss

data-mining machine-learning xgboost prediction
2022-03-09 20:14:35

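I have a question about custom objectives in xgboost: I cannot get consistent predictions. In other words, the scale of my predictions does not match the values I am trying to predict. I have tried many custom losses, but I always run into the same problem.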

import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.datasets import make_regression

n_samples_train = 500
n_samples_test = 100
n_features = 200

X, y = make_regression(n_samples_train, n_features,noise=10)
X_test, y_test = make_regression(n_samples_test, n_features,noise=10)

param = {'verbosity' : 1,
      'max_depth' : 12,
      'learning_rate' : 0.01,
      'nthread' : 3,
        }

dtrain = xgb.DMatrix(X, y)

best_nrounds = 50

bst_reglinear = xgb.train(param, 
                      dtrain, 
                      best_nrounds)


def reg_obj(preds,dtrain):
    y = dtrain.get_label()
    N = len(y)
    #residual = (preds-y).astype("float")
    grad = 2*preds-y
    hess = 2*N*np.ones(len(y))
    return grad, hess

bst_custom = xgb.train(param,
                   dtrain,
                   best_nrounds,
                   obj = reg_obj)



dtest = xgb.DMatrix(X_test)


pred = bst_reglinear.predict(dtest)
print(np.abs(pred).mean())

pred_custom = bst_custom.predict(dtest)
print(np.abs(pred_custom).mean())
1 Answer

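I think your grad and hess are wrong. Assuming the loss you want is the squared error (preds - y)^2, its derivative with respect to the prediction is 2*(preds - y), not 2*preds - y, and its second derivative is the constant 2. The extra factor N in your hess makes every leaf value roughly N times smaller, which would explain predictions on the wrong scale. Try: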

grad = 2*(preds-y)
hess = 2*np.ones(len(y))
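
To make the point concrete, here is a minimal sketch (not a definitive implementation) that checks the corrected gradient and Hessian against finite differences and shows how a Hessian inflated by N shrinks a leaf value. It assumes the intended loss is the plain squared error. `FakeDMatrix` and `leaf_weight` are hypothetical helpers introduced only for this illustration: the first stands in for `xgb.DMatrix` so the snippet runs without xgboost, and the second applies the usual single-leaf formula -sum(grad) / (sum(hess) + lambda) with lambda = 1 and no learning rate.

```python
import numpy as np


def squared_error_obj(preds, dtrain):
    """Custom objective for L = (pred - y)^2: per-sample gradient and Hessian."""
    y = dtrain.get_label()
    grad = 2.0 * (preds - y)          # dL/dpred
    hess = 2.0 * np.ones_like(preds)  # d2L/dpred2, constant
    return grad, hess


class FakeDMatrix:
    """Hypothetical stand-in for xgb.DMatrix; only get_label() is provided."""

    def __init__(self, label):
        self._label = np.asarray(label, dtype=float)

    def get_label(self):
        return self._label


def leaf_weight(grad, hess, reg_lambda=1.0):
    """Hypothetical helper: value of a single leaf holding all samples."""
    return -grad.sum() / (hess.sum() + reg_lambda)


rng = np.random.default_rng(0)
y = rng.normal(scale=100.0, size=500)
preds = np.full_like(y, 0.5)
data = FakeDMatrix(y)

grad, hess = squared_error_obj(preds, data)

# Central finite differences of the loss, used as an independent check.
eps = 1e-2
loss = lambda p: (p - y) ** 2
num_grad = (loss(preds + eps) - loss(preds - eps)) / (2 * eps)
num_hess = (loss(preds + eps) - 2 * loss(preds) + loss(preds - eps)) / eps**2

grad_ok = np.allclose(grad, num_grad, atol=1e-4)
hess_ok = np.allclose(hess, num_hess, atol=1e-4)

# Same gradient, but with the Hessian from the question (2 * N per sample).
N = len(y)
w_correct = leaf_weight(grad, hess)
w_inflated = leaf_weight(grad, 2.0 * N * np.ones(N))
shrink_factor = abs(w_correct / w_inflated)

print("gradient matches finite differences:", grad_ok)
print("hessian matches finite differences:", hess_ok)
print("leaf value shrinks by a factor of about", round(shrink_factor))
```

The same `squared_error_obj` can be passed as `obj=squared_error_obj` to `xgb.train`, exactly where the question passes `reg_obj`.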