Data mining - Should I fit my parameters by brute force?

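I am analyzing data from a type of sensor that my company manufactures. I want to quantify each sensor's health from three features, using the following formula: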

sensor health index = feature1 * A + feature2 * B + feature3 * C

We also need to choose a threshold, so that a sensor is considered bad whenever its index exceeds that threshold.
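To make the formula and the threshold rule concrete, here is a minimal sketch. The `Sensor` record, the coefficient values, and the threshold are all made up for illustration; the real feature names and scales are not given here.

```python
from dataclasses import dataclass


@dataclass
class Sensor:
    # Hypothetical record layout; the real feature names are not given.
    id: str
    feature1: float
    feature2: float
    feature3: float


def health_index(sensor: Sensor, a: float, b: float, c: float) -> float:
    """Weighted sum of the three features: feature1*A + feature2*B + feature3*C."""
    return sensor.feature1 * a + sensor.feature2 * b + sensor.feature3 * c


def is_bad(sensor: Sensor, a: float, b: float, c: float, threshold: float) -> bool:
    """A sensor counts as bad when its health index exceeds the threshold."""
    return health_index(sensor, a, b, c) > threshold


# Illustrative values only.
s = Sensor(id="s-001", feature1=0.5, feature2=1.0, feature3=2.0)
print(health_index(s, a=2, b=3, c=1))         # 0.5*2 + 1.0*3 + 2.0*1 = 6.0
print(is_bad(s, a=2, b=3, c=1, threshold=5))  # True, since 6.0 > 5
```

The whole fitting problem is then about picking `a`, `b`, `c`, and `threshold` so that `is_bad` agrees with what is known about the sensors.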

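All we have is an old list showing that about 100 sensors are bad, but we now have data for more than 10,000 sensors. A sensor that is not on that list of 100 is not necessarily good. So I guess a linear regression approach won't work in this case.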

The only approach I can think of is brute-force fitting. A rough sketch in code:

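from collections import namedtuple

# parameter set: the three coefficients plus the threshold
Params = namedtuple("Params", ["a", "b", "c", "th"])


def brute_force_fit(sensors, known_bad_sensors):
    # sensors: objects with id, feature1, feature2, feature3
    # known_bad_sensors: set of ids taken from the old list of bad sensors
    # dictionary of parameter set -> accuracy rate
    accuracy_by_params = {}

    for thold in range(1, 21):
        for a in range(1, 11):
            for b in range(1, 11):
                for c in range(1, 11):
                    # sensors flagged as bad under this parameter set
                    bad_list = []
                    params = Params(a, b, c, thold)
                    for sensor in sensors:
                        health_index = (sensor.feature1 * a
                                        + sensor.feature2 * b
                                        + sensor.feature3 * c)
                        if health_index > thold:
                            bad_list.append(sensor.id)
                    # accuracy: fraction of known bad sensors that were flagged
                    common = known_bad_sensors.intersection(bad_list)
                    accuracy_by_params[params] = len(common) / len(known_bad_sensors)

    # rank params based on accuracy, best first
    ranked = sorted(accuracy_by_params.items(),
                    key=lambda item: item[1], reverse=True)
    # the params with the highest accuracy are the best model
    return ranked[0]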

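I really don't like this approach because it uses 5 nested loops, which is very inefficient. I would like to know whether there is a better way to do this. Maybe with an existing library, such as scikit-learn?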
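Short of switching to a library, one modest saving is available: the health index does not depend on the threshold, so the weighted sum can be computed once per (a, b, c) and reused for every threshold. The sketch below does that in plain Python. The tuple layout for `sensors`, the name `grid_search`, and the grid ranges are assumptions, and "accuracy" is read as the fraction of known bad sensors that get flagged. Under that reading a setting that flags every sensor scores 100%, so the sketch also records how many sensors each setting flags and uses that count to break ties.

```python
from itertools import product


def grid_search(sensors, known_bad, coef_range=range(1, 11), thresholds=range(1, 21)):
    """Evaluate every (a, b, c, threshold) combination on the grid.

    sensors: list of (sensor_id, feature1, feature2, feature3) tuples (assumed layout).
    known_bad: set of sensor ids from the old list of bad sensors.
    Returns a list of (recall, flagged_count, (a, b, c, threshold)), best first.
    """
    results = []
    for a, b, c in product(coef_range, repeat=3):
        # The index does not depend on the threshold: compute it once per (a, b, c).
        scores = [(sid, f1 * a + f2 * b + f3 * c) for sid, f1, f2, f3 in sensors]
        for th in thresholds:
            flagged = {sid for sid, score in scores if score > th}
            recall = len(flagged & known_bad) / len(known_bad)
            results.append((recall, len(flagged), (a, b, c, th)))
    # Highest recall first; among ties, prefer settings that flag fewer sensors.
    results.sort(key=lambda r: (-r[0], r[1]))
    return results


# Tiny made-up data set, for illustration only.
sensors = [("s1", 0.1, 0.1, 0.1), ("s2", 2.0, 0.5, 0.1), ("s3", 0.2, 3.0, 1.0)]
known_bad = {"s3"}
results = grid_search(sensors, known_bad, coef_range=range(1, 4), thresholds=range(1, 6))
best = results[0]
print(best)
```

This is still an exhaustive search over the same grid; it only avoids recomputing the weighted sums for each threshold, and the comparison against each threshold is still done per sensor.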