Hello, I am looking for an example of K-Means clustering in Python for a dataset with more than 6 features. Thanks.
K-Means visualization problem with 8 numeric features
data mining
Python
k-means
2022-03-01 21:25:43
1 answer
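It is not entirely clear what you are trying to do. If I understand correctly, you want to fit a K-Means clustering and visualize the result. Your dataset has 8 dimensions, however, and you cannot plot a space like that directly.
What you can do is reduce the data to 2 dimensions and then create the plot.
For example: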
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
# read my data with pandas into a dataframe
data = pd.read_csv("data.csv")
# run a KMeans model with 3 clusters. Change that number to what you want
# (precompute_distances and n_jobs were removed from KMeans in scikit-learn 1.0)
clustering_kmeans = KMeans(n_clusters=3, n_init=10)
clusters = clustering_kmeans.fit_predict(data)
# run PCA to reduce the dimensionality to 2 dimensions
reduced_data = PCA(n_components=2).fit_transform(data)
# create a new dataframe that contains the 2 dimensions and the cluster label
results = pd.DataFrame(reduced_data,columns=['pca1','pca2'])
results['label'] = clusters
# plot the results with a scatterplot
sns.scatterplot(x="pca1", y="pca2", hue="label", data=results)
plt.show()
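If your 8 features are on different scales, it may also help to standardize them before clustering, and to check how much of the variance the 2-D projection actually keeps. The following is only a rough sketch of that idea: the data comes from `make_blobs` as a stand-in for your CSV, and the helper name `cluster_and_project`, the seed, and the choice of 3 clusters are assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def cluster_and_project(X, n_clusters=3, seed=0):
    """Scale the features, cluster in the full space, then project to 2-D."""
    # Put every feature on a comparable scale (zero mean, unit variance)
    X_scaled = StandardScaler().fit_transform(X)
    # Cluster on all scaled features, not on the 2-D projection
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X_scaled)
    # PCA is used only to get 2-D coordinates for plotting
    pca = PCA(n_components=2)
    coords = pca.fit_transform(X_scaled)
    # Fraction of the total variance retained by the two components
    explained = float(pca.explained_variance_ratio_.sum())
    return labels, coords, explained


# Synthetic stand-in for an 8-feature dataset (hypothetical, not the asker's data)
X, _ = make_blobs(n_samples=300, n_features=8, centers=3, random_state=0)
labels, coords, explained = cluster_and_project(X)
print(coords.shape, f"variance kept by 2 components: {explained:.2f}")
```

`coords` and `labels` can go into the same `results` dataframe and scatterplot as above. If the retained variance is low, clusters that overlap in the plot may still be well separated in the original 8 dimensions.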