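I obtained two sets of clusters (Cluster_set_1 and Cluster_set_2) by analysing two different sets of newspaper articles.
Each cluster consists of words/phrases, as shown in the examples below.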
C1 in Cluster_set_1: Energy, Fuel, Oil, Mining
C2 in Cluster_set_1: school, education, students, schools, million, read
...
...
etc.
C1 in Cluster_set_2: Gas, oil, pipeline
C2 in Cluster_set_2: program, business, management, information, reports
...
...
etc.
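Now I want to find similar/related clusters across the two cluster sets, based on the words/phrases each cluster contains, as in the example below.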
Example:
The cluster 'Energy, Fuel, Oil, Mining' in Cluster_set_1 is most similar/related to
the cluster 'Gas, oil, pipeline' in Cluster_set_2.
Reason: both contain words/phrases related to energy.
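Since I am dealing with two different sets of clusters, what would be a suitable method for linking clusters in one set to their counterparts in the other?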
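
To make the task concrete, here is a minimal baseline sketch, not a recommended final method. It assumes each cluster can be treated as a set of lowercased terms and scores every pair of clusters by Jaccard overlap (shared terms divided by total distinct terms). The function names `normalize`, `jaccard`, and `match_clusters` are hypothetical helpers made up for this illustration. The obvious limitation is that it only sees exact word matches: it links the two energy clusters through "oil", but it has no idea that "Fuel" and "Gas" are related, which is what a semantic approach (for example, comparing word embeddings) would have to supply.

```python
def normalize(cluster):
    """Lowercase and strip each term so that 'Oil' and 'oil' compare equal."""
    return {term.strip().lower() for term in cluster}


def jaccard(cluster_a, cluster_b):
    """Jaccard similarity: |A intersect B| / |A union B| over normalized terms."""
    a, b = normalize(cluster_a), normalize(cluster_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_clusters(set_1, set_2):
    """Map each cluster name in set_1 to (best match in set_2, score).

    The match is None when no cluster in set_2 shares any term.
    """
    matches = {}
    for name_1, terms_1 in set_1.items():
        best_name, best_score = None, 0.0
        for name_2, terms_2 in set_2.items():
            score = jaccard(terms_1, terms_2)
            if score > best_score:
                best_name, best_score = name_2, score
        matches[name_1] = (best_name, best_score)
    return matches


# The example clusters from the question (only the ones actually listed).
cluster_set_1 = {
    "C1": ["Energy", "Fuel", "Oil", "Mining"],
    "C2": ["school", "education", "students", "schools", "million", "read"],
}
cluster_set_2 = {
    "C1": ["Gas", "oil", "pipeline"],
    "C2": ["program", "business", "management", "information", "reports"],
}

matches = match_clusters(cluster_set_1, cluster_set_2)
for name, (best, score) in matches.items():
    print(f"Cluster_set_1 {name} -> Cluster_set_2 {best} (score {score:.2f})")
```

On these four clusters, C1 in Cluster_set_1 is linked to C1 in Cluster_set_2 because they share one term out of six distinct terms, while C2 in Cluster_set_1 gets no match because it has no term in common with either cluster in Cluster_set_2.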