I am working with the Titanic dataset from Kaggle. I am using pipelines and trying to prune my decision tree, for which I need cost_complexity_pruning_path. The last line of code raises an error:
ValueError: could not convert string to float: 'male'
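Do you know what I am doing wrong? I have looked at "Sklearn: applying cost complexity pruning along with pipeline", but it does not seem to help in my case.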
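from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

cat_vars = ['Sex', 'Embarked']
num_vars = ['Age']
num_pipe = Pipeline([('imputer', SimpleImputer(strategy='mean')), ('std_scaler', StandardScaler())])
cat_pipe = Pipeline([('imputer', SimpleImputer(strategy='most_frequent')), ('ohe', OneHotEncoder())])
col_trans = ColumnTransformer([('numerical', num_pipe, num_vars), ('categorical', cat_pipe, cat_vars)], remainder='passthrough')
final_pipe = Pipeline([('column_trans', col_trans), ('tree', DecisionTreeClassifier(random_state=42))])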
final_pipe.fit(X_train, y_train)
path = final_pipe.steps[1][1].cost_complexity_pruning_path(X_train, y_train)
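A likely cause, offered as a sketch rather than a confirmed diagnosis: the last line calls cost_complexity_pruning_path directly on the tree step, so the raw X_train (with strings such as 'male') reaches the tree without passing through the ColumnTransformer. One way around this is to transform X_train with the fitted preprocessing step first and hand the result to the tree. The small DataFrame below is a hypothetical stand-in for the Titanic data, and the Fare column and the name X_train_prepared are assumptions added for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for the Titanic training data (not the real dataset).
X_train = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "male", "female", "male", "female"],
    "Embarked": ["S", "C", "S", "Q", "S", np.nan, "C", "S"],
    "Age": [22.0, 38.0, 26.0, np.nan, 35.0, 27.0, 54.0, 14.0],
    "Fare": [7.25, 71.28, 7.92, 8.46, 8.05, 11.13, 51.86, 30.07],
})
y_train = pd.Series([0, 1, 1, 0, 1, 1, 0, 0])

cat_vars = ["Sex", "Embarked"]
num_vars = ["Age"]

# Same pipeline layout as in the question.
num_pipe = Pipeline([("imputer", SimpleImputer(strategy="mean")),
                     ("std_scaler", StandardScaler())])
cat_pipe = Pipeline([("imputer", SimpleImputer(strategy="most_frequent")),
                     ("ohe", OneHotEncoder())])
col_trans = ColumnTransformer([("numerical", num_pipe, num_vars),
                               ("categorical", cat_pipe, cat_vars)],
                              remainder="passthrough")
final_pipe = Pipeline([("column_trans", col_trans),
                       ("tree", DecisionTreeClassifier(random_state=42))])
final_pipe.fit(X_train, y_train)

# Apply the already-fitted preprocessing step so the tree receives numbers only.
X_train_prepared = final_pipe.named_steps["column_trans"].transform(X_train)

# Compute the pruning path on the transformed features, not on the raw frame.
tree = final_pipe.named_steps["tree"]
path = tree.cost_complexity_pruning_path(X_train_prepared, y_train)
ccp_alphas, impurities = path.ccp_alphas, path.impurities
print(ccp_alphas)
```

A chosen alpha can then be fed back into the full pipeline with final_pipe.set_params(tree__ccp_alpha=alpha) before refitting, so the preprocessing and the pruned tree stay together.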