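Question: I want to implement a decision tree in which each leaf is a linear regression. Does such a model exist (preferably in sklearn)?
Example case 1:
Mockup data is generated using the following formula: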
y = int(x) + x * 1.5
Which looks like:
I want to solve this using a decision tree where the final decision results in a linear formula. Something like:
- 0 <= x < 1 -> y = 0 + 1.5 * x
- 1 <= x < 2 -> y = 1 + 1.5 * x
- 2 <= x < 3 -> y = 2 + 1.5 * x
- etc.
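To make the target concrete, here is a minimal hand-written sketch of the kind of model I have in mind for this case. The names `thresholds`, `leaves` and `predict` are hypothetical, and the split points and coefficients are hard-coded from the formula above rather than learned from data:

```python
from bisect import bisect_right

# Hypothetical hand-built "tree": split points on x, hard-coded, not learned.
thresholds = [1, 2, 3, 4]

# One (intercept, slope) pair per leaf; leaf k covers k <= x < k + 1.
leaves = [(0.0, 1.5), (1.0, 1.5), (2.0, 1.5), (3.0, 1.5), (4.0, 1.5)]


def predict(x):
    """Route x to its leaf, then apply that leaf's linear formula (0 <= x < 5)."""
    intercept, slope = leaves[bisect_right(thresholds, x)]
    return intercept + slope * x


print(predict(0.5), predict(2.5))
```

What I am looking for is a model that learns both the split points and the per-leaf coefficients from the training data.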
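I figured this would best be done with a decision tree. I did some googling and thought that DecisionTreeRegressor from sklearn.tree could work, but it assigns a constant value to all points within a range, as shown below:
Code: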
import matplotlib.pyplot as plt
import numpy as np
from sklearn.tree import DecisionTreeRegressor
x = np.linspace(0, 5, 100)
y = np.array([int(i) for i in x]) + x * 1.5
x_train = np.linspace(0, 5, 10)
y_train = np.array([int(i) for i in x_train]) + x_train * 1.5
clf = DecisionTreeRegressor()
clf.fit(x_train.reshape((len(x_train), 1)), y_train.reshape((len(x_train), 1)))
y_result = clf.predict(x.reshape(len(x), 1))
plt.plot(x, y, label='actual results')
plt.plot(x, y_result, label='model predicts')
plt.legend()
plt.show()
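The piecewise-constant behaviour can also be checked without a plot. The following is a small sketch that reuses the training data from the code above and compares the number of distinct predicted values with the number of leaves; the variable names `tree`, `n_leaves` and `n_distinct` are my own:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

x = np.linspace(0, 5, 100)
x_train = np.linspace(0, 5, 10)
y_train = np.array([int(i) for i in x_train]) + x_train * 1.5

tree = DecisionTreeRegressor()
tree.fit(x_train.reshape(-1, 1), y_train)
y_result = tree.predict(x.reshape(-1, 1))

# Each leaf stores a single constant, so the model can never output
# more distinct values than it has leaves.
n_leaves = tree.get_n_leaves()
n_distinct = len(np.unique(y_result))
print(n_leaves, n_distinct)
```

However many points are passed to predict, the output takes at most one value per leaf, which is why the plot shows a staircase instead of sloped segments.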
Example case 2: instead of one input there are two inputs, x1 and x2, and the output is calculated as follows:
- x1 = 0 -> y = 1 * x2
- x1 = 1 -> y = 3 * x2 + 5
- x1 = 6 -> y = -1 * x2 - 4
- otherwise -> y = x2 * 20 - 100
Code:
import matplotlib.pyplot as plt
import random
def get_y(x1, x2):
    if x1 == 0:
        return x2
    if x1 == 1:
        return 3 * x2 + 5
    if x1 == 6:
        return -x2 - 4
    return x2 * 20 - 100
X_0 = [(0, random.random()) for _ in range(100)]
x2_0 = [i[1] for i in X_0]
y_0 = [get_y(i[0], i[1]) for i in X_0]
X_1 = [(1, random.random()) for _ in range(100)]
x2_1 = [i[1] for i in X_1]
y_1 = [get_y(i[0], i[1]) for i in X_1]
X_2 = [(6, random.random()) for _ in range(100)]
x2_2 = [i[1] for i in X_2]
y_2 = [get_y(i[0], i[1]) for i in X_2]
X_3 = [(random.randint(10, 100), random.random()) for _ in range(100)]
x2_3 = [i[1] for i in X_3]
y_3 = [get_y(i[0], i[1]) for i in X_3]
plt.scatter(x2_0, y_0, label='x1 = 0')
plt.scatter(x2_1, y_1, label='x1 = 1')
plt.scatter(x2_2, y_2, label='x1 = 6')
plt.scatter(x2_3, y_3, label='x1 not 0, 1 or 6')
plt.grid()
plt.xlabel('x2')
plt.ylabel('y')
plt.legend()
plt.show()
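For this second case, the behaviour I am after can be roughly imitated by hand in two stages: a small DecisionTreeRegressor that partitions on x1, followed by one LinearRegression on x2 per leaf. This is only a sketch under strong assumptions, not an existing sklearn estimator: it assumes the splitting feature (x1) and the regression feature (x2) are known in advance and that the number of leaves is fixed at 4. The names `leaf_models` and `predict` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)


def get_y(x1, x2):
    if x1 == 0:
        return x2
    if x1 == 1:
        return 3 * x2 + 5
    if x1 == 6:
        return -x2 - 4
    return x2 * 20 - 100


# Same four groups as above: x1 in {0, 1, 6} plus random integers in [10, 100].
x1 = np.concatenate([np.full(100, 0), np.full(100, 1), np.full(100, 6),
                     rng.integers(10, 101, size=100)])
x2 = rng.random(400)
y = np.array([get_y(a, b) for a, b in zip(x1, x2)])

# Stage 1 (assumption): partition on x1 only, with the leaf count fixed at 4.
tree = DecisionTreeRegressor(max_leaf_nodes=4, random_state=0)
tree.fit(x1.reshape(-1, 1), y)
leaf_ids = tree.apply(x1.reshape(-1, 1))

# Stage 2: fit one linear regression of y on x2 inside each leaf.
leaf_models = {}
for leaf in np.unique(leaf_ids):
    mask = leaf_ids == leaf
    leaf_models[leaf] = LinearRegression().fit(x2[mask].reshape(-1, 1), y[mask])


def predict(x1_new, x2_new):
    """Route each sample to a leaf via x1, then apply that leaf's linear model to x2."""
    x1_new = np.asarray(x1_new, dtype=float).reshape(-1, 1)
    x2_new = np.asarray(x2_new, dtype=float).reshape(-1, 1)
    leaves = tree.apply(x1_new)
    out = np.empty(len(x1_new))
    for leaf, model in leaf_models.items():
        mask = leaves == leaf
        if mask.any():
            out[mask] = model.predict(x2_new[mask])
    return out


print(predict([0, 1, 6, 50], [0.5, 0.5, 0.5, 0.5]))
```

The splits here are still chosen by the ordinary constant-leaf criterion, so this only works when the groups differ clearly in their mean y. It would not recover the integer boundaries in case 1, where the best constant-leaf splits do not coincide with the jumps.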
So my question is: does a decision tree in which each leaf is a linear regression exist?