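I have some code that looks for clusters in x,y data. To check the number of clusters I use, I would like to obtain the BIC. This is not (easily) possible with `kmeans()`, so I have switched to the mclust package. Specifically, I am trying to replace `kmeans()` from the R stats package with `Mclust()` from the mclust package.

Using `Mclust()` requires me to specify which model should be used for clustering. According to `?Mclust`, the following models are available for `Mclust()`: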
univariate mixture
"E" = equal variance (one-dimensional)
"V" = variable variance (one-dimensional)
multivariate mixture
"EII" = spherical, equal volume
"VII" = spherical, unequal volume
"EEI" = diagonal, equal volume and shape
"VEI" = diagonal, varying volume, equal shape
"EVI" = diagonal, equal volume, varying shape
"VVI" = diagonal, varying volume and shape
"EEE" = ellipsoidal, equal volume, shape, and orientation
"EEV" = ellipsoidal, equal volume and equal shape
"VEV" = ellipsoidal, equal shape
"VVV" = ellipsoidal, varying volume, shape, and orientation
single component
"X" = univariate normal
"XII" = spherical multivariate normal
"XXI" = diagonal multivariate normal
"XXX" = ellipsoidal multivariate normal
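
For context, here is a minimal sketch of how BIC can be compared across some of the models listed above. The simulated `data` matrix and the choice to restrict the search to the two spherical models are assumptions made purely for illustration; they are not taken from my real data.

```r
library(mclust)

set.seed(1)
# Hypothetical 2-D data: two Gaussian blobs (illustration only)
data <- rbind(
  matrix(rnorm(200, mean = 0), ncol = 2),
  matrix(rnorm(200, mean = 5), ncol = 2)
)

# BIC for G = 1..9 components, restricted to the two spherical models
bic <- mclustBIC(data, G = 1:9, modelNames = c("EII", "VII"))

# mclust reports BIC so that larger values indicate a better model
summary(bic)
```

The result is a table with one row per number of components and one column per model name, so it can be inspected or plotted directly with `plot(bic)`.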
I assume that k-means from stats is a "spherical, unequal volume" model, i.e. for `kmeans(x = data, centers = 6)` to match `Mclust()`, I should use `Mclust(data, G = 6, modelNames = c("VII"))`.
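However, in the limited testing I have done, this gives different cluster centroids. The example below uses 6 clusters and some test data, and shows the centroids obtained by each method.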
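
The original test data and output are not reproduced here. As a stand-in, the following is a minimal sketch of the kind of comparison described, assuming simulated data: the six blob locations in `true_centers`, the seed, and `nstart = 25` are all hypothetical choices, not the values from my actual test.

```r
library(mclust)

set.seed(42)
# Hypothetical test data: 6 well-separated Gaussian blobs in 2-D
true_centers <- cbind(x = c(0, 5, 10, 15, 20, 25),
                      y = c(0, 5, 0, 5, 0, 5))
data <- do.call(rbind, lapply(seq_len(nrow(true_centers)), function(i) {
  cbind(x = rnorm(100, mean = true_centers[i, "x"]),
        y = rnorm(100, mean = true_centers[i, "y"]))
}))

# k-means centroids: a 6 x 2 matrix (clusters in rows)
km <- kmeans(x = data, centers = 6, nstart = 25)
km_centers <- km$centers

# Mclust component means: stored as d x G, so transpose to 6 x 2
mc <- Mclust(data, G = 6, modelNames = "VII")
mc_centers <- t(mc$parameters$mean)

# Cluster labels are arbitrary in both methods, so sort by x before comparing
km_sorted <- km_centers[order(km_centers[, 1]), , drop = FALSE]
mc_sorted <- mc_centers[order(mc_centers[, 1]), , drop = FALSE]

print(km_sorted)
print(mc_sorted)
print(km_sorted - mc_sorted)  # per-coordinate difference between the two methods
```

Sorting by the first coordinate only works here because the simulated blobs have distinct x positions; for real data the centroids would need to be matched in some other way before comparing them.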
Can anyone confirm which `Mclust()` model is equivalent to `kmeans()`?