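I am trying to build a linear regression model with polynomial features, but when I evaluate it I get very strange scores. As far as I know, R^2 is a valid metric for this kind of model, and I think I have tried everything. I would really appreciate a good suggestion. Here is my code: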
from sklearn import metrics
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# df_all is a pandas DataFrame (sample rows are shown further down)
X = df_all[['Elevation_gain', 'Distance']]
y = df_all['Avg_tempo_in_seconds']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit one model per polynomial degree from 2 to 9
for n in range(2, 10):
    poly_feat = PolynomialFeatures(degree=n, include_bias=True)
    X_poly_train = poly_feat.fit_transform(X_train)
    X_poly_test = poly_feat.transform(X_test)
    lin_reg_2 = LinearRegression()
    lin_reg_2.fit(X_poly_train, y_train)
    test_pred_2 = lin_reg_2.predict(X_poly_test)
    # Test-set evaluation
    r2 = metrics.r2_score(y_true=y_test, y_pred=test_pred_2)
    mse = metrics.mean_squared_error(y_true=y_test, y_pred=test_pred_2)
    print(round(r2, 2))
    # print(round(mse, 2))
This is the output I get (one R^2 value per degree, from 2 to 9):
0.36
-3.99
-59.96
-1299.38
-627.37
-1773329.36
-19673802.94
-23125681.65
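For context on how scores like the ones above can be negative at all, here is a minimal sketch of R^2 computed by hand, assuming the standard definition 1 - SS_res / SS_tot. The helper name `r2_score_manual` and the numbers are made up for illustration; they are not taken from my data set.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Squared error of the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of y_true
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (hypothetical, not from the real data)
y_true = [290.0, 300.0, 310.0]
good_pred = [291.0, 299.0, 311.0]  # close to the targets
wild_pred = [200.0, 500.0, 100.0]  # much worse than predicting the mean

r2_good = r2_score_manual(y_true, good_pred)
r2_wild = r2_score_manual(y_true, wild_pred)
print(round(r2_good, 3), round(r2_wild, 2))
```

So a negative R^2 on the test set means the predictions are worse than simply predicting the mean of `y_test`, and there is no lower bound on how negative it can get.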
Here is a sample of the data:
Elevation_gain | Distance | Avg_tempo_in_seconds |
---|---|---|
70 | 6.13 | 290.1 |
135 | 9.27 | 301.0 |
10 | 4.94 | 287.5 |
270 | 15.74 | 310.2 |
120 | 8.11 | 298.5 |
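To make the question easier to reproduce, the sample rows can be loaded into a DataFrame as sketched below. This assumes `df_all` is a pandas DataFrame with exactly these three columns and that the first Distance value is 6.13; the real data set has more rows than these five, so the scores will not match the output above.

```python
import pandas as pd

# The five sample rows from the table; the real df_all is larger.
df_all = pd.DataFrame(
    {
        "Elevation_gain": [70, 135, 10, 270, 120],
        "Distance": [6.13, 9.27, 4.94, 15.74, 8.11],
        "Avg_tempo_in_seconds": [290.1, 301.0, 287.5, 310.2, 298.5],
    }
)

# Same feature/target split as in the code at the top
X = df_all[["Elevation_gain", "Distance"]]
y = df_all["Avg_tempo_in_seconds"]
print(X.shape, y.shape)
```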