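Can anyone suggest how to remove local outliers from a data frame? I have code that detects local outliers, but I need help removing them from the data frame (setting those values to zero). Any suggestions would be greatly appreciated.
The code for detecting local outliers is as follows: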
import numpy as np
from sklearn.metrics import mean_absolute_error

def printOutliers(series, window, scale=1.96, print_outliers=False):
    rolling_mean = series.rolling(window=window).mean()
    # Print the outliers (index and value)
    if print_outliers:
        # Mean absolute error measures the difference between two continuous variables.
        mae = mean_absolute_error(series[window:], rolling_mean[window:])
        deviation = 3 * np.std(series[window:] - rolling_mean[window:])
        lower_bound = rolling_mean - (mae + scale * deviation)
        upper_bound = rolling_mean + (mae + scale * deviation)
        outliers_lower = series[series < lower_bound]
        outliers_upper = series[series > upper_bound]
        print("values beyond lower bound are: " + "\n" + str(outliers_lower))
        print("values beyond upper bound are: " + "\n" + str(outliers_upper))

printOutliers(df['Column1'].dropna(how='any'), 10, print_outliers=True)
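
For illustration, here is a minimal sketch of one way the flagged values could be set to zero. It is a sketch under assumptions, not a definitive answer: the function name zero_local_outliers is made up for this example, the data frame index is assumed to be unique, and the mean absolute error is computed directly with pandas rather than with sklearn (which should give the same number). The bounds are otherwise the same as in printOutliers.

```python
import numpy as np
import pandas as pd


def zero_local_outliers(df, column, window, scale=1.96):
    """Return a copy of df with local outliers in `column` set to 0.

    Hypothetical helper; assumes df has a unique index.
    """
    series = df[column].dropna()
    rolling_mean = series.rolling(window=window).mean()

    # Same bounds as printOutliers: MAE plus a scaled multiple of the
    # standard deviation of the residuals after the first `window` rows.
    residual = (series - rolling_mean).iloc[window:]
    mae = residual.abs().mean()
    deviation = 3 * np.std(residual.to_numpy())
    margin = mae + scale * deviation

    # Comparisons against the NaN warm-up part of rolling_mean are False,
    # so the first window - 1 rows are never flagged.
    is_outlier = (series < rolling_mean - margin) | (series > rolling_mean + margin)

    # Work on a copy so the original data frame is left untouched.
    result = df.copy()
    result.loc[is_outlier[is_outlier].index, column] = 0
    return result
```

It would be called as df = zero_local_outliers(df, 'Column1', 10). Building a boolean mask and assigning through .loc keeps the row labels aligned even though the NaN rows were dropped before the bounds were computed.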