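How can I fit KNN to get the nearest neighbors and then, in Scikit-Learn, use a linear regression (instead of a weighted average) to aggregate those neighbors into a fit?
I tried using KNeighborsTransformer and then building a pipeline with LinearRegression, but that does not seem to do the right thing.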
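KNeighborsTransformer only gives you the indices and distances of the nearest neighbors. You need to do a bit more work to retrieve the points themselves and fit your linear regression on them.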
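To make the difference concrete, here is a minimal sketch for a single query point. The toy data, the query point and the variable names are illustrative assumptions, not part of the question. It first looks at what KNeighborsTransformer returns (a sparse distance graph, which is what a downstream LinearRegression in a pipeline would receive as its features), then uses NearestNeighbors to look up the neighbor indices and fit a regression on the actual neighboring points.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsTransformer, NearestNeighbors

# Illustrative toy data: an exactly linear target, so the local fit is easy to check.
rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(50, 2))
y_train = X_train @ np.array([2.0, -1.0]) + 0.5

# KNeighborsTransformer yields a sparse (n_samples, n_samples_fit) distance graph.
# In a pipeline, LinearRegression would regress on these distances,
# not on the coordinates of the neighboring points.
graph = KNeighborsTransformer(n_neighbors=5, mode="distance").fit_transform(X_train)
graph_shape = graph.shape

# Instead: look up the neighbor indices, pull out the points, and fit on those.
search = NearestNeighbors(n_neighbors=5).fit(X_train)
x_query = np.array([[0.3, -0.2]])
idx = search.kneighbors(x_query, return_distance=False)[0]
local_model = LinearRegression().fit(X_train[idx], y_train[idx])
local_pred = local_model.predict(x_query)[0]
```

Doing this for every query point, and caching the fitted models, is essentially what a full estimator has to do.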
Here is a draft that seems to work:
from sklearn.neighbors import NearestNeighbors
from sklearn.base import RegressorMixin, BaseEstimator, clone
from sklearn.linear_model import Lasso
from sklearn.utils import check_array, check_X_y
import numpy as np


class LocalLinearRegressor(RegressorMixin, BaseEstimator):
    def __init__(self, n_neighbors=10, linear_model=Lasso()):
        self.n_neighbors = n_neighbors
        # Used only as a template: it is cloned for every local fit.
        self.linear_model = linear_model

    def fit(self, X, y):
        "Fits the neighbors search."
        X, y = check_X_y(X, y)
        self._fit_X = X
        self._fit_y = y
        self.neighbor_search = NearestNeighbors(n_neighbors=self.n_neighbors)
        self.neighbor_search.fit(X)
        # Cache of fitted local models, keyed by the tuple of neighbor indices.
        self.local_regressors_ = {}
        return self

    def predict(self, X):
        """Fits linear regressions on the k nearest training points to predict new values.

        We don't fit these linear regressions at fit time because there would be so many.
        However, we do save the regressions as we see them to speed up predictions.
        """
        X = check_array(X)
        neighbors = self.neighbor_search.kneighbors(X, return_distance=False)
        # Sort each row so the same neighbor set always maps to the same cache key.
        neighbors = np.sort(neighbors, axis=1)
        ksets, mapper = np.unique(neighbors, return_inverse=True, axis=0)
        # Guard against NumPy versions where the inverse is not 1-D when axis is given.
        mapper = mapper.ravel()
        for kset in ksets:
            if tuple(kset) in self.local_regressors_:
                continue
            local_X = self._fit_X[kset, :]
            local_y = self._fit_y[kset]
            self.local_regressors_[tuple(kset)] = clone(self.linear_model).fit(local_X, local_y)
        return np.array([
            self.local_regressors_[tuple(ksets[mapper[i]])].predict(X[i, :].reshape(1, -1))[0]
            for i in range(X.shape[0])
        ])
Here is a Colab notebook showing it in action.
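If you want a quick sanity check without the notebook, the same idea can be written as a stateless function with no caching. This is only a sketch under assumptions: the helper name local_linear_predict, the sine toy data and the use of plain LinearRegression are mine, not taken from the notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import NearestNeighbors


def local_linear_predict(X_train, y_train, X_query, n_neighbors=10):
    """Predict each query point from a regression fitted on its k nearest training points."""
    search = NearestNeighbors(n_neighbors=n_neighbors).fit(X_train)
    neighbors = search.kneighbors(X_query, return_distance=False)
    preds = np.empty(X_query.shape[0])
    for i, idx in enumerate(neighbors):
        # One fresh model per query point; no reuse across identical neighbor sets.
        model = LinearRegression().fit(X_train[idx], y_train[idx])
        preds[i] = model.predict(X_query[i:i + 1])[0]
    return preds


# Illustrative data: a smooth nonlinear target that is roughly linear locally.
rng = np.random.default_rng(42)
X_train = rng.uniform(-2, 2, size=(200, 1))
y_train = np.sin(X_train[:, 0])
X_query = np.linspace(-1.5, 1.5, 7).reshape(-1, 1)

preds = local_linear_predict(X_train, y_train, X_query, n_neighbors=10)
max_err = np.max(np.abs(preds - np.sin(X_query[:, 0])))
```

This version refits a model for every query point, so it is slower than the cached class on large prediction sets, but it is a handy reference to compare against.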