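I have about 500k rows of data. I am new to data science, and I am trying to train a model with a support vector machine as part of my analysis. On my small MacBook Pro, training seems to run endlessly. Right now I am using the RBF kernel, and I see no end to the computation.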

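However, I do have access to 100+ compute nodes with 16 cores each. My problem is that I don't know how to make use of them, because I lack the knowledge of how to approach this SVM. Right now I am using scikit-learn.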

scikit-learn code:

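from sklearn import svm


# Fit an RBF-kernel SVC on the training split and return predictions for the test split.
def makeModelandPrediction(trainData, trainLabel, testData):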
    model = svm.SVC(
        C=1.0,
        cache_size=200,
        class_weight=None,
        coef0=0.0,
        decision_function_shape='ovr',  # None is rejected by current scikit-learn; 'ovr' is the default
        degree=3,
        gamma='auto',
        kernel='rbf',
        max_iter=-1,
        probability=False,
        random_state=None,
        shrinking=True,
        tol=0.001,
        verbose=False,
        )
    model.fit(trainData, trainLabel)
    prediction = model.predict(testData)
    return prediction

I have already preprocessed the data and split it 70/30 into training and test sets. Can someone point a beginner in the right direction?
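
For reference, a 70/30 split like the one described could be produced roughly as follows. This is only a minimal sketch: it assumes scikit-learn's train_test_split, and it uses a small synthetic dataset from make_classification as a stand-in, since the real data is not shown. The variable names match the function arguments above; testLabel and the random_state values are illustrative additions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data (assumption): 1,000 rows, 81 columns.
X, y = make_classification(n_samples=1000, n_features=81, random_state=0)

# Hold out 30% of the rows for testing; stratify keeps the class
# proportions similar in both parts.
trainData, testData, trainLabel, testLabel = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
```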

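from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC

# decision_function_shape=None is rejected by current scikit-learn; 'ovr' is the default.
clf = BaggingClassifier(SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf', max_iter=-1, probability=False, random_state=None, shrinking=True, tol=0.001, verbose=False))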
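
One way the BaggingClassifier idea above might be put to work is to give each SVC only a fraction of the rows and fit the ensemble members in parallel. The following is a rough sketch under stated assumptions, not a tuned solution: the data is synthetic (make_classification), and n_estimators=8, max_samples=1/8, and the random_state values are illustrative choices rather than recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the real data (assumption): 2,000 rows, 81 columns.
X, y = make_classification(
    n_samples=2000, n_features=81, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

n_estimators = 8  # illustrative choice
clf = BaggingClassifier(
    SVC(kernel="rbf", C=1.0, gamma="auto"),
    n_estimators=n_estimators,
    max_samples=1.0 / n_estimators,  # each SVC is fit on about 1/8 of the rows
    n_jobs=-1,                       # fit ensemble members on all local cores
    random_state=0,
)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Because kernel SVM training time grows faster than linearly with the number of rows, several small fits tend to be much cheaper than one large fit. Note that n_jobs only parallelizes across the cores of a single machine; spreading the work across multiple nodes would need additional tooling that this sketch does not cover.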

How do I run an SVM on 500k rows with 81 columns?
