My question is related to:
Difference between statsmodel OLS and scikit linear regression
I have essentially the same question, except my results differ even more. Running the following simple linear regression, I get almost completely opposite coefficient-of-determination (R²) results:
import pandas as pd
import statsmodels.api as sm
from sklearn import linear_model
x1 = [26.0, 31.0, 47.0, 51.0, 50.0, 49.0, 37.0, 33.0, 49.0, 54.0, 31.0, 49.0, 48.0, 49.0, 49.0, 47.0, 44.0, 48.0, 35.0, 43.0]
y1 = [116.0, 94.0, 100.0, 102.0, 116.0, 116.0, 68.0, 118.0, 91.0, 104.0, 78.0, 116.0, 90.0, 109.0, 116.0, 118.0, 108.0, 119.0, 110.0, 102.0]
# Fit and summarize statsmodel OLS model
model_sm = sm.OLS(x1, y1)
result_sm = model_sm.fit()
print(result_sm.summary())
# Create sklearn linear regression object
ols_sk = linear_model.LinearRegression(fit_intercept=True)
# fit model
model_sk = ols_sk.fit(pd.DataFrame(x1), pd.DataFrame(y1))
# sklearn coefficient of determination
coefofdet = model_sk.score(pd.DataFrame(x1), pd.DataFrame(y1))
print('sklearn R^2: ' + str(coefofdet))
Statsmodels gives me an R² of 0.962, while sklearn gives me 0.0584069073664.
What is causing such a huge difference?