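I have a dataset of about 1388 unique products, and I need to apply unsupervised learning to it in order to detect anomalies (high/low peaks).
The data below represents just one product. ContextID is the product number, and StepID indicates the different stages of the product's manufacture.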
ContextID BacksGas_Flow_sccm StepID Time_ms
427 7290057 1.7578125 1 09:20:15.273
428 7290057 1.7578125 1 09:20:15.513
429 7290057 1.953125 2 09:20:15.744
430 7290057 1.85546875 2 09:20:16.814
431 7290057 1.7578125 2 09:20:17.833
432 7290057 1.7578125 2 09:20:18.852
433 7290057 1.7578125 2 09:20:19.872
434 7290057 1.7578125 2 09:20:20.892
435 7290057 1.7578125 2 09:20:22.42
436 7290057 16.9921875 5 09:20:23.82
437 7290057 46.19140625 5 09:20:24.102
438 7290057 46.19140625 5 09:20:25.122
439 7290057 46.6796875 5 09:20:26.142
440 7290057 46.6796875 5 09:20:27.162
441 7290057 46.6796875 5 09:20:28.181
442 7290057 46.6796875 5 09:20:29.232
443 7290057 46.6796875 5 09:20:30.361
444 7290057 46.6796875 5 09:20:31.381
445 7290057 46.6796875 5 09:20:32.401
446 7290057 46.6796875 5 09:20:33.431
447 7290057 46.6796875 5 09:20:34.545
448 7290057 46.6796875 5 09:20:34.761
449 7290057 46.6796875 5 09:20:34.972
450 7290057 46.6796875 5 09:20:36.50
451 7290057 46.6796875 5 09:20:37.120
452 7290057 46.6796875 7 09:20:38.171
453 7290057 46.6796875 7 09:20:39.261
454 7290057 46.6796875 7 09:20:40.280
455 7290057 46.6796875 12 09:20:41.429
456 7290057 46.6796875 12 09:20:42.449
457 7290057 46.6796875 12 09:20:43.469
458 7290057 46.6796875 12 09:20:44.499
459 7290057 46.6796875 12 09:20:45.559
460 7290057 46.6796875 12 09:20:45.689
461 7290057 47.16796875 12 09:20:46.710
462 7290057 46.6796875 12 09:20:47.749
463 7290057 46.6796875 15 09:20:48.868
464 7290057 46.6796875 15 09:20:49.889
465 7290057 46.6796875 16 09:20:50.910
466 7290057 46.6796875 16 09:20:51.938
467 7290057 24.21875 19 09:20:52.999
468 7290057 38.76953125 19 09:20:54.27
469 7290057 80.46875 19 09:20:55.68
470 7290057 72.75390625 19 09:20:56.128
471 7290057 59.5703125 19 09:20:57.247
472 7290057 63.671875 19 09:20:58.278
473 7290057 70.5078125 19 09:20:59.308
474 7290057 71.875 19 09:21:00.337
475 7290057 69.82421875 19 09:21:01.358
476 7290057 69.23828125 19 09:21:02.408
477 7290057 69.23828125 19 09:21:03.548
478 7290057 72.4609375 19 09:21:04.597
479 7290057 73.4375 19 09:21:05.615
480 7290057 73.4375 19 09:21:06.647
481 7290057 73.4375 19 09:21:07.675
482 7290057 73.4375 19 09:21:08.697
483 7290057 73.4375 19 09:21:09.727
484 7290057 74.21875 19 09:21:10.796
485 7290057 75.1953125 19 09:21:11.827
486 7290057 75.1953125 19 09:21:12.846
487 7290057 75.1953125 19 09:21:13.865
488 7290057 75.1953125 19 09:21:14.886
489 7290057 75.1953125 19 09:21:15.907
490 7290057 75.9765625 19 09:21:16.936
491 7290057 75.9765625 19 09:21:17.975
492 7290057 75.9765625 19 09:21:18.997
493 7290057 75.9765625 19 09:21:20.27
494 7290057 75.9765625 19 09:21:21.55
495 7290057 75.9765625 19 09:21:22.75
496 7290057 75.9765625 19 09:21:23.95
497 7290057 76.85546875 19 09:21:24.204
498 7290057 76.85546875 19 09:21:25.225
499 7290057 76.85546875 19 09:21:25.957
500 7290057 76.85546875 19 09:21:26.984
501 7290057 75.9765625 19 09:21:27.995
502 7290057 75.9765625 19 09:21:29.2
503 7290057 76.7578125 19 09:21:30.13
504 7290057 76.7578125 19 09:21:31.33
505 7290057 76.7578125 19 09:21:32.59
506 7290057 76.7578125 19 09:21:33.142
507 7290057 76.7578125 19 09:21:34.153
508 7290057 75.87890625 19 09:21:34.986
509 7290057 75.87890625 19 09:21:35.131
510 7290057 75.87890625 19 09:21:35.272
511 7290057 75.87890625 19 09:21:35.451
512 7290057 76.7578125 19 09:21:36.524
513 7290057 76.7578125 19 09:21:37.651
514 7290057 76.7578125 19 09:21:38.695
515 7290057 76.7578125 19 09:21:39.724
516 7290057 76.7578125 19 09:21:40.760
517 7290057 76.7578125 19 09:21:41.783
518 7290057 76.7578125 19 09:21:42.802
519 7290057 76.7578125 19 09:21:43.822
520 7290057 76.7578125 19 09:21:44.862
521 7290057 76.7578125 19 09:21:45.884
522 7290057 76.7578125 19 09:21:46.912
523 7290057 76.7578125 19 09:21:47.933
524 7290057 76.7578125 19 09:21:48.952
525 7290057 76.7578125 19 09:21:49.972
526 7290057 76.7578125 19 09:21:51.72
527 7290057 77.5390625 19 09:21:52.290
528 7290057 77.5390625 19 09:21:52.92
529 7290057 77.5390625 19 09:21:53.361
530 7290057 77.5390625 19 09:21:54.435
531 7290057 76.66015625 19 09:21:55.602
532 7290057 76.66015625 19 09:21:56.621
533 7290057 72.94921875 22 09:21:57.652
534 7290057 3.90625 24 09:21:58.749
535 7290057 2.5390625 24 09:21:59.801
536 7290057 2.1484375 24 09:22:00.882
537 7290057 2.05078125 24 09:22:01.259
538 7290057 2.1484375 24 09:22:01.53
539 7290057 1.953125 24 09:22:02.281
540 7290057 1.953125 24 09:22:03.311
541 7290057 2.1484375 24 09:22:04.331
542 7290057 2.1484375 24 09:22:05.351
543 7290057 1.953125 24 09:22:06.432
544 7290057 1.85546875 24 09:22:07.519
545 7290057 1.7578125 24 09:22:08.549
546 7290057 1.85546875 24 09:22:09.710
547 7290057 1.7578125 24 09:22:10.738
548 7290057 1.85546875 24 09:22:11.798
549 7290057 1.953125 24 09:22:12.820
550 7290057 1.85546875 1 09:22:13.610
551 7290057 1.85546875 1 09:22:14.629
552 7290057 1.953125 1 09:22:15.649
553 7290057 1.85546875 2 09:22:16.679
554 7290057 1.85546875 2 09:22:17.709
555 7290057 1.85546875 2 09:22:18.729
556 7290057 1.953125 2 09:22:19.748
557 7290057 1.85546875 2 09:22:20.768
558 7290057 1.7578125 3 09:22:21.788
559 7290057 1.7578125 3 09:22:22.808
560 7290057 1.85546875 3 09:22:23.829
561 7290057 1.953125 3 09:22:24.848
562 7290057 1.85546875 3 09:22:25.898
563 7290057 1.953125 3 09:22:27.39
564 7290057 1.953125 3 09:22:28.66
565 7290057 1.7578125 3 09:22:29.87
566 7290057 1.85546875 3 09:22:30.108
567 7290057 1.7578125 3 09:22:31.129
568 7290057 1.953125 3 09:22:32.147
569 7290057 1.85546875 3 09:22:33.187
I plotted the graph with the following code.
Code:
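import matplotlib.pyplot as plt

# Select the rows that belong to a single product (ContextID 7290057)
lineplot = X.loc[X['ContextID'] == 7290057]
x_axis = lineplot['Time_ms']             # time stamps (column 3)
y_axis = lineplot['BacksGas_Flow_sccm']  # gas flow readings (column 1)
plt.figure(1)
plt.plot(x_axis, y_axis)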
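In this plot, the peaks (circled in red) are the anomalies that need to be detected.
And when I have a plot like this one: since there are no unwanted peaks, no anomalies should be flagged.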
I tried using OneClassSVM, but I am not satisfied with the results.
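For illustration, here is a minimal sketch of how OneClassSVM could be applied to one product's flow trace. It is not the exact setup that was used: the feature choice (flow value plus StepID), the `nu` and `gamma` settings, the helper name `flag_anomalies`, and the small synthetic trace are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def flag_anomalies(trace, nu=0.05, gamma="scale"):
    """Return a boolean Series that is True where OneClassSVM labels a row an outlier.

    `trace` is assumed to hold the rows of a single ContextID.
    """
    # Assumed feature set: the flow reading and the manufacturing step.
    features = trace[["BacksGas_Flow_sccm", "StepID"]].to_numpy(dtype=float)
    # Scale the features so the RBF kernel is not dominated by one column.
    scaled = StandardScaler().fit_transform(features)
    model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma)
    labels = model.fit_predict(scaled)  # +1 = inlier, -1 = outlier
    return pd.Series(labels == -1, index=trace.index, name="is_anomaly")


# Synthetic stand-in for one product (values are made up, not the real data).
rng = np.random.default_rng(0)
trace = pd.DataFrame({
    "ContextID": 7290057,
    "BacksGas_Flow_sccm": np.concatenate([
        rng.normal(1.8, 0.1, 20),   # low-flow phase
        rng.normal(76.0, 1.0, 60),  # high-flow phase
        rng.normal(1.9, 0.1, 20),   # back to low flow
    ]),
    "StepID": [2] * 20 + [19] * 60 + [24] * 20,
})

is_anomaly = flag_anomalies(trace)
print(trace[is_anomaly])
```

One thing to keep in mind with this kind of setup: `nu` acts as an upper bound on the fraction of training points treated as errors, so when the model is fitted and scored on the same trace it tends to flag roughly that share of rows even when the trace has no unwanted peaks.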
I would like to know which unsupervised learning algorithms could be used for a task like this.