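I have `X_train` with shape (14599, 13) and I am trying to impute the NaNs with each column's median. Somehow the imputation seems to run along rows and fails, because a row contains a date rather than a numeric value. I looked for an `axis` parameter on `SimpleImputer` but could not find one. How can I fix this?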
```python
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

plt.close('all')

avo_sales = pd.read_csv('avocados.csv')
avo_sales.rename(columns={'4046': 'small PLU sold',
                          '4225': 'large PLU sold',
                          '4770': 'xlarge PLU sold'},
                 inplace=True)
# Remove spaces from column names so they can be used as attributes
avo_sales.columns = avo_sales.columns.str.replace(' ', '')

plt.scatter(avo_sales.Date, avo_sales.TotalBags)

# Features are every column except the target (Date is still included)
x = np.array(avo_sales.drop(['TotalBags'], axis=1))
y = np.array(avo_sales.TotalBags)

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

imp = SimpleImputer(strategy='median')
X_train = imp.fit_transform(X_train)  # raises the ValueError shown below
```
Output:

```
ValueError: Cannot use median strategy with non-numeric data:
could not convert string to float: '12/31/2017'
```
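
For what it's worth, `SimpleImputer` already computes its statistic per column, so the missing `axis` parameter is probably not the issue. The error message suggests the median cannot be computed because a column holds date strings such as `'12/31/2017'`. One possible approach, sketched below, is to keep the data as a DataFrame and impute only the numeric columns. The small frame and its column names are made up for illustration and are not the real contents of `avocados.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical stand-in for avocados.csv: one string-valued date column
# and two numeric columns with missing values.
df = pd.DataFrame({
    "Date": ["12/31/2017", "12/24/2017", "12/17/2017", "12/10/2017"],
    "AveragePrice": [1.0, np.nan, 3.0, 5.0],
    "TotalVolume": [10.0, 20.0, np.nan, 40.0],
})

# Select only the numeric columns; the Date column is left untouched.
numeric_cols = df.select_dtypes(include="number").columns

# The median is computed separately for each selected column.
imp = SimpleImputer(strategy="median")
df[numeric_cols] = imp.fit_transform(df[numeric_cols])

price_fill = df.loc[1, "AveragePrice"]   # value that replaced the NaN
volume_fill = df.loc[2, "TotalVolume"]   # value that replaced the NaN
```

In the original workflow the same idea would mean calling `fit_transform` on the numeric columns of the training split and `transform` on those of the test split, rather than converting everything to a NumPy array first. If the date is needed as a feature, it would have to be converted to a numeric representation separately.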