My Jupyter notebook's Python kernel keeps dying when I try to train an XGBoost logistic classifier. I have run all of the code below successfully in the past; it is only now failing. First, here is the block that still runs without any problem:
import xgboost as xgb
xgtrain = xgb.DMatrix(data = X_train_sub.values, label = Y_train.values) # create dense matrix of training values
xgtest = xgb.DMatrix(data = X_test_sub.values, label = Y_test.values) # create dense matrix of test values
param = {'max_depth':2, 'eta':1, 'silent':1, 'objective':'binary:logistic'} # specify parameters via map
My data is small:
X_train_imp_sub.shape
(1365, 18)
However, the kernel keeps dying on this block:
xgmodel = xgb.train(param, xgtrain, num_boost_round = 2) # train the model
predictions = xgmodel.predict(xgtest) # make prediction
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_true = Y_test,
y_pred = predictions.round(),
normalize = True) # If False, return # of correctly classified samples. Else, return fraction of correctly classified samples
print("Accuracy: %.2f%%" % (accuracy * 100.0))
As mentioned, I was previously able to run both of these blocks successfully. I have since closed all other notebooks and restarted my machine, with no luck. I launch Jupyter via Anaconda Navigator on a MacBook Pro.
Update:
To debug further, I switched to the terminal CLI and ran:
- conda install runipy
- runipy MyNotebook.ipynb
This was extremely helpful, because it produced the following error message, which I could then google:
OMP: Error #15: Initializing libomp.dylib, but found libiomp5.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://openmp.llvm.org/
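As the hint itself notes, one unsafe workaround is to set KMP_DUPLICATE_LIB_OK=TRUE so the duplicate-runtime check is skipped. Presumably that would look something like the sketch below, placed in the very first cell before importing xgboost or matplotlib; note this only suppresses the check rather than removing the duplicate libomp, and per the hint it may crash or silently produce wrong results:

import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"  # allow duplicate OpenMP runtimes (unsafe, per the OMP hint above)

import xgboost as xgb  # import only after the variable is set, so the check is bypassed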
I found this thread:
https://github.com/dmlc/xgboost/issues/1715
which shows that other people have hit the same issue on macOS. I also suspect the problem is somehow related to matplotlib, because running any cell containing matplotlib-related code is what seems to cause xgb.train to kill the kernel.
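A minimal sketch of how I think that suspicion could be checked in a fresh kernel (assuming X_train_sub and Y_train are already loaded as above); my guess is that matplotlib pulls in one OpenMP runtime (libiomp5, via the Anaconda/MKL stack) and xgboost then loads its own libomp, matching the two libraries named in the error:

import matplotlib.pyplot as plt  # presumably loads libiomp5 through the Anaconda/MKL numpy stack
import xgboost as xgb            # xgboost's bundled libomp would then be the second OpenMP runtime

xgtrain = xgb.DMatrix(data = X_train_sub.values, label = Y_train.values)
param = {'max_depth':2, 'eta':1, 'silent':1, 'objective':'binary:logistic'}
xgmodel = xgb.train(param, xgtrain, num_boost_round = 2)  # kernel dies here if the two runtimes collide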