I am trying to use SelectKBest to select the most important features:
# SelectKBest:
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2
sel = SelectKBest(chi2, k='all')
# Load Dataset:
from sklearn import datasets
iris = datasets.load_iris()
# Run SelectKBest on the unscaled iris.data
newx = sel.fit_transform(iris.data, iris.target)
print(newx[0:5])
It works fine, and the output is:
[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5. 3.6 1.4 0.2]]
However, when I try to use SelectKBest on scaled data, I get an error:
# Scale iris.data
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X = scaler.fit_transform(iris.data)
# Run SelectKBest on the scaled data X
newx = sel.fit_transform(X, iris.target)
The error output:
ValueError: Input X must be non-negative.
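How can I scale the data so that it contains no negative values for this purpose? Or is scaling not needed at all when selecting features from a dataset?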
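
For reference, here is a minimal sketch of one scaling approach that might avoid the error. It assumes MinMaxScaler (which is not used anywhere above) is an acceptable substitute for StandardScaler; the variable name X_minmax is only for illustration. StandardScaler centers each feature at zero, so roughly half of the values become negative, whereas MinMaxScaler maps each feature to the range [0, 1] by default. Whether scaling is appropriate before chi2 at all is a separate question that this sketch does not settle.

```python
from sklearn import datasets
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import MinMaxScaler

iris = datasets.load_iris()

# Assumption: MinMaxScaler replaces StandardScaler here.
# With the default feature_range=(0, 1), no scaled value is negative.
X_minmax = MinMaxScaler().fit_transform(iris.data)

# Same selector as in the question; k='all' keeps every feature,
# so the per-feature scores are the informative part.
sel = SelectKBest(chi2, k='all')
newx = sel.fit_transform(X_minmax, iris.target)

print(X_minmax.min() >= 0)  # check that the input is non-negative
print(newx.shape)
print(sel.scores_)          # chi2 score for each feature
```

Note that with k='all' nothing is actually dropped; passing an integer such as k=2 would keep only the two highest-scoring features.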