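I have an unlabeled dataset and I am running k-means flat clustering on it with 2 clusters. Every time I run the following program, the labels come out different. How can I keep the labels from changing? Is that even possible?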
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

X = np.array([[1, 2],
              [5, 8],
              [1.5, 1.8],
              [8, 8],
              [1, 0.6],
              [9, 11]])
# Fit k-means with 2 clusters (no random seed is set here)
kmeans = KMeans(n_clusters=2)
kmeans.fit(X)
centroids = kmeans.cluster_centers_
labels = kmeans.labels_
colors = ["g.", "r."]
# Print and plot each point, colored by its cluster label
for i in range(len(X)):
    print("coordinate:", X[i], "label:", labels[i])
    plt.plot(X[i][0], X[i][1], colors[labels[i]], markersize=10)
# Mark the cluster centers
plt.scatter(centroids[:, 0], centroids[:, 1], marker="x", s=150, linewidths=5, zorder=10)
print(centroids)
print(labels)
plt.show()
On the first run the labels are [0 1 0 1 0 1]. On the second run the labels are [1 0 1 0 1 0]. How can I fix this?
On the first run, this is how clusters are assigned to the dataset.
[1, 2] -> 0
[5, 8] -> 1
[1.5, 1.8] -> 0
[8, 8] -> 1
[1, 0.6] -> 0
[9, 11] -> 1
On the second run, this is how clusters are assigned to the dataset.
[1, 2] -> 1
[5, 8] -> 0
[1.5, 1.8] -> 1
[8, 8] -> 0
[1, 0.6] -> 1
[9, 11] -> 0
How can I keep the assignment from changing between runs?
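
For reference, here is a minimal sketch of one possible way to get stable labels; it is an illustration, not a definitive fix. Both runs above group the points identically, and only the cluster numbers are swapped, since k-means numbers its clusters arbitrarily. The sketch relies on two things: `random_state`, an existing `KMeans` parameter that fixes the random initialization, and a relabeling step that numbers clusters by sorted centroid position. The helper name `fit_with_stable_labels` and the sort-by-centroid rule are my own assumptions, not part of scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8], [1, 0.6], [9, 11]])


def fit_with_stable_labels(X, n_clusters=2, seed=0):
    """Hypothetical helper: fit k-means, then renumber clusters by centroid position."""
    # random_state fixes the initialization, so repeated runs give the same fit
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    kmeans.fit(X)
    centers = kmeans.cluster_centers_

    # Order clusters by centroid x, breaking ties by y
    # (np.lexsort treats the LAST key as the primary sort key)
    order = np.lexsort((centers[:, 1], centers[:, 0]))

    # remap[old_label] = new_label, so label 0 is always the left-most centroid
    remap = np.empty(n_clusters, dtype=int)
    remap[order] = np.arange(n_clusters)

    return remap[kmeans.labels_], centers[order]


labels, centroids = fit_with_stable_labels(X)
print(labels)
print(centroids)
```

If only run-to-run reproducibility matters, passing `random_state` alone should be enough. The relabeling step is there for the case where the label numbers must also keep a fixed meaning, for example when the seed changes.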