I want to run a grid search over the pool_classifiers hyperparameter of the OLA() (Overall Local Accuracy) model from the deslib Python package.
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.model_selection import cross_val_score
from deslib.dcs.ola import OLA
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
Then:
X , y = make_classification( n_samples = 10000 , n_features = 20 , n_informative = 15 , n_redundant = 5 , random_state = 999 )
model = OLA()
cv = RepeatedStratifiedKFold( n_splits = 10 , n_repeats = 3 , random_state = 999 )
grid = dict()
grid[ 'pool_classifiers' ] = [ [ LogisticRegression() , DecisionTreeClassifier() , GaussianNB() ] ,
[ LogisticRegression() , DecisionTreeClassifier() ] ]
search = GridSearchCV( model , grid , scoring = 'accuracy' , cv = cv )
search_results = search.fit( X , y )
But this raises the following error message:
NotFittedError: This LogisticRegression instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
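This suggests that the models in the pool must be fitted before the grid search, but I expected the fitting to happen on the training fold of each CV split.
Does this mean I have to fit every model in the pool on each training fold myself?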
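To make the question concrete, here is a minimal sketch of what I imagine "fitting on each training fold" would look like. The FoldFittedDS wrapper, its pool and dsel_size parameters, and the run_ola_grid_search helper are hypothetical names of my own, not part of deslib. The sketch assumes that OLA accepts already-fitted models through its pool_classifiers constructor argument, and that holding out part of each training fold as a separate selection set (DSEL) is reasonable. I have also reduced the sample size and number of folds to keep it quick.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


class FoldFittedDS(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper (not part of deslib).

    Fits a fresh copy of every pool model inside fit(), so GridSearchCV
    refits the pool on each training fold, then hands the fitted pool
    to a dynamic selection model such as OLA.
    """

    def __init__(self, ds_class, pool, dsel_size=0.5, random_state=None):
        self.ds_class = ds_class        # e.g. deslib's OLA class (assumption)
        self.pool = pool                # list of UNFITTED scikit-learn classifiers
        self.dsel_size = dsel_size      # share of the fold held out as DSEL
        self.random_state = random_state

    def fit(self, X, y):
        # Split the training fold: one part trains the pool, the other
        # serves as the dynamic selection set (DSEL).
        X_train, X_dsel, y_train, y_dsel = train_test_split(
            X, y, test_size=self.dsel_size, stratify=y,
            random_state=self.random_state)
        # Clone so the estimators listed in the grid stay unfitted.
        fitted_pool = [clone(est).fit(X_train, y_train) for est in self.pool]
        self.ds_ = self.ds_class(pool_classifiers=fitted_pool)
        self.ds_.fit(X_dsel, y_dsel)
        self.classes_ = np.unique(y)
        return self

    def predict(self, X):
        return self.ds_.predict(X)


def run_ola_grid_search():
    # Imported here so the wrapper itself only depends on scikit-learn.
    from deslib.dcs.ola import OLA

    X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                               n_redundant=5, random_state=999)
    # Note: the grid key is the wrapper's 'pool', not OLA's 'pool_classifiers'.
    grid = {
        "pool": [
            [LogisticRegression(max_iter=1000), DecisionTreeClassifier(), GaussianNB()],
            [LogisticRegression(max_iter=1000), DecisionTreeClassifier()],
        ]
    }
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=999)
    model = FoldFittedDS(OLA, pool=grid["pool"][0], random_state=999)
    search = GridSearchCV(model, grid, scoring="accuracy", cv=cv)
    return search.fit(X, y)
```

Calling run_ola_grid_search() would then return the fitted GridSearchCV object, with best_params_ holding the winning pool. I am not sure this is the intended way to use deslib, which is part of what I am asking.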
Thanks for any help on this.