Data mining - Using the output of ML algorithms as input to a different ML algorithm (ensemble learning) - 吾爱随笔录

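I want to assign weights to several models and build an ensemble. I would like to use the models' outputs as the input to a new machine learning algorithm that learns the right weights. But when the outputs look like the ones below, how do I feed the outputs of multiple models into a new ML algorithm?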

preds1 = model1.predict_proba(xx)
[[0.28054154 0.35648097 0.32954868 0.03342881]
 [0.20625692 0.30749627 0.37018309 0.11606372]
 [0.28362306 0.33325501 0.34658685 0.03653508]
 ...

preds2 = model2.predict_proba(xx)
[[0.22153498 0.30271243 0.26420254 0.21155006]
 [0.32327647 0.39197589 0.23899729 0.04575035]
 [0.18440374 0.32447016 0.4736297  0.0174964 ]
 ...

How can I build a single DataFrame from the outputs of these two or more models?
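One way to get such a DataFrame is to place the probability matrices side by side, one block of columns per model. The following is only a minimal sketch: the helper name `stack_probabilities` and the column naming scheme are assumptions made for illustration, and the arrays are the first two rows of the outputs shown above.

```python
import numpy as np
import pandas as pd


def stack_probabilities(prob_arrays, model_names):
    """Place each model's class-probability matrix side by side in one DataFrame."""
    frames = []
    for name, probs in zip(model_names, prob_arrays):
        probs = np.asarray(probs)
        # One column per (model, class) pair, e.g. "model1_class0".
        columns = [f"{name}_class{k}" for k in range(probs.shape[1])]
        frames.append(pd.DataFrame(probs, columns=columns))
    # axis=1 concatenates column-wise; rows stay aligned by position.
    return pd.concat(frames, axis=1)


# First two rows of the predict_proba outputs from the question.
preds1 = np.array([[0.28054154, 0.35648097, 0.32954868, 0.03342881],
                   [0.20625692, 0.30749627, 0.37018309, 0.11606372]])
preds2 = np.array([[0.22153498, 0.30271243, 0.26420254, 0.21155006],
                   [0.32327647, 0.39197589, 0.23899729, 0.04575035]])

meta_features = stack_probabilities([preds1, preds2], ["model1", "model2"])
print(meta_features.shape)
```

Each row of `meta_features` then describes one sample by the class probabilities from every model, which is the shape a second-stage learner expects.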

The simplest approach is shown below, but I want to feed the outputs to a different ML algorithm so that it learns the weights.

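import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

# xx_train, yy_train, xx_test, yy_test come from the asker's own split (not shown).
# Assumed here: a grid of candidate weights and one score accumulator per weight.
weights = np.linspace(0, 1, 11)
scores_corr_wts = np.zeros(len(weights))

model = LogisticRegression()
model.fit(xx_train, yy_train)
preds1 = model.predict_proba(xx_test)

model = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)
model.fit(xx_train, yy_train)
preds2 = model.predict_proba(xx_test)

# Each weight is evaluated by calculating the corresponding score.
# argmax returns a column index, which matches the label only if classes are 0..K-1.
for i in range(len(weights)):
    final_inner_preds = np.argmax(preds1 * weights[i] + preds2 * (1 - weights[i]), axis=1)
    scores_corr_wts[i] += accuracy_score(yy_test, final_inner_preds)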
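To let a second model learn the combination instead of searching a weight grid, the base models' probabilities can be used as training features for a meta-learner. The sketch below is one possible way to do that by hand, not the asker's code: it assumes the iris data as a stand-in dataset, `cv=5`, `max_iter=1000`, and a logistic regression as the meta-learner. The meta-learner is trained on out-of-fold predictions so that it never sees probabilities a base model produced for rows it was fitted on.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (assumption); replace with your own xx / yy.
X, y = load_iris(return_X_y=True)
xx_train, xx_test, yy_train, yy_test = train_test_split(
    X, y, stratify=y, random_state=42)

base_models = [
    LogisticRegression(max_iter=1000),
    KNeighborsClassifier(n_neighbors=5, metric="minkowski", p=2),
]

# Out-of-fold probabilities: every training row is predicted by a clone
# of the model that was fitted without that row.
train_meta = np.hstack([
    cross_val_predict(m, xx_train, yy_train, cv=5, method="predict_proba")
    for m in base_models
])

# Fit each base model on the full training set, then predict the test set.
test_meta = np.hstack([
    m.fit(xx_train, yy_train).predict_proba(xx_test) for m in base_models
])

# The meta-learner's coefficients play the role of the learned weights.
meta_model = LogisticRegression(max_iter=1000)
meta_model.fit(train_meta, yy_train)
stacked_accuracy = accuracy_score(yy_test, meta_model.predict(test_meta))
print(stacked_accuracy)
```

Unlike the single scalar weight in the grid search, the meta-learner here gets one coefficient per model, per class probability, per output class, so it can trust a model more on some classes than on others.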

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Single models
estimators = [
    ('rf', RandomForestClassifier(n_estimators=10, random_state=42)),
    ('svr', make_pipeline(StandardScaler(), LinearSVC(random_state=42)))]

# Stack both single models
clf = StackingClassifier(
    estimators=estimators, final_estimator=LogisticRegression())

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42)

# Fit model
clf.fit(X_train, y_train).score(X_test, y_test)
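As a rough illustration of how the same `StackingClassifier` idea might be applied to the two models from the question, and how the learned weights can be read back afterwards, consider the sketch below. The estimator names `lr` and `knn`, the `max_iter=1000` setting, `cv=5`, and the explicit `stack_method="predict_proba"` are assumptions chosen for the example, and iris again stands in for the asker's data.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42)

stack = StackingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",  # feed class probabilities to the final estimator
    cv=5,                          # out-of-fold predictions train the final estimator
)
stack.fit(X_train, y_train)

# transform() returns the base models' probabilities side by side,
# i.e. the input the final estimator actually sees.
meta_features = stack.transform(X_test)

# One row per output class, one column per (base model, class) probability.
learned_weights = stack.final_estimator_.coef_
print(meta_features.shape, learned_weights.shape)
```

The `coef_` matrix is the closest analogue to the hand-tuned `weights` in the question: larger magnitudes indicate base-model probabilities that the final estimator relies on more heavily.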