Visualizing a Twitter retweet network

data-mining visualization social-network-analysis matplotlib twitter

2022-02-09 18:10:33

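I am trying to visualize a retweet network to find out which users are most likely to have the greatest influence on other users. Here is my code: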

    import networkx as nx

    G_retweet = nx.from_pandas_edgelist(translated_iranian_tweets,
                                        source = "userid",
                                        target = "retweet_userid",
                                        create_using = nx.DiGraph())

    print('There are {} Nodes inside Retweet Network'.format(len(G_retweet.nodes())))
    print('There are {} Edges inside Retweet Network'.format(len(G_retweet.edges())))

    import matplotlib.pyplot as plt

    # Size varies by the number of edges the node has (its degree)
    sizes = [x[1] for x in G_retweet.degree()]

    nx.draw_networkx(G_retweet,
                     pos = nx.circular_layout(G_retweet),
                     with_labels = False,
                     node_size = sizes,
                     width = 0.1,
                     alpha = 0.7,
                     arrowsize = 2,
                     linewidths = 0)

    plt.axis('off')
    plt.show()

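The network has 18,631 nodes and 35,008 edges. The visualization is terrible: you cannot make anything out. Does anyone have suggestions on what I should do? Should I shrink the dataset by extracting a specific type of user through particular tweets and then visualize that smaller network, or try something else?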

1 Answer

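To answer a question about a graph, you are not supposed to rely on visualizing it. Visualizing a graph is for getting a rough idea of what it looks like. Some graph visualization techniques exist to give preliminary insight (for example, whether there are visually obvious communities).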

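Your question has an analytical answer. Once the graph object is built, you can get its adjacency matrix as a Python array or matrix. For a directed graph this matrix is generally not symmetric, because an edge from u to v does not imply an edge from v to u, so incoming and outgoing degrees differ. In NetworkX you can get the in-degree and out-degree information as shown below. Then the main story starts!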
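As a minimal sketch of the adjacency-matrix view, assuming NumPy is installed alongside NetworkX: the toy graph below is illustrative, not the retweet data. Row sums of the matrix give out-degrees and column sums give in-degrees.

```python
import networkx as nx
import numpy as np

# Toy directed graph; an edge (u, v) means "u points to v".
g = nx.DiGraph()
g.add_edges_from([(0, 1), (1, 2), (2, 3), (4, 2), (4, 3)])

# Fix the node order explicitly so rows/columns can be mapped back to nodes.
nodes = sorted(g.nodes())
A = nx.to_numpy_array(g, nodelist=nodes, dtype=int)

# A[i, j] == 1 when there is an edge nodes[i] -> nodes[j].
out_deg = A.sum(axis=1)  # row sums: outgoing edges
in_deg = A.sum(axis=0)   # column sums: incoming edges

print(A)
print("out-degree:", dict(zip(nodes, out_deg.tolist())))
print("in-degree: ", dict(zip(nodes, in_deg.tolist())))
print("symmetric:", bool(np.array_equal(A, A.T)))
```

For a graph with tens of thousands of nodes a dense array gets large, so reading the degrees straight from the graph object is usually the more practical route.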

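The most influential users in a social network can be seen as the most central nodes, and centrality measures capture them for you. The simplest is degree centrality: it says that the nodes with the highest degree have the most influence on the graph. In your case, be careful about how you model "who retweeted whom", because it is the person being retweeted who has the influence. If being retweeted corresponds to an incoming edge (in-degree) in your model, this code gets them for you: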

    import networkx as nx

    # Toy directed graph; an edge (u, v) is read here as "u retweeted v"
    g = nx.DiGraph()
    g.add_edges_from([(0, 1), (1, 2), (2, 3), (4, 2), (4, 3)])

    # Map each node to its in-degree (number of incoming edges)
    dict_deg = dict(g.in_degree)
    print('In-Degree Dictionary\n', dict_deg)

    # Select by node key, not by position in the dict, so non-integer IDs also work
    m = max(dict_deg.values())
    print('The node(s) number', [node for node, deg in dict_deg.items() if deg == m], 'have the most influence!')

Output:

    In-Degree Dictionary
     {0: 0, 1: 1, 2: 2, 3: 2, 4: 0}
    The node(s) number [2, 3] have the most influence!
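The same idea can be sketched for the edge direction used in the question, assuming `userid` is the retweeting account and `retweet_userid` is the account being retweeted. The user names and the `top_k` cutoff below are hypothetical, chosen only for illustration; `nx.in_degree_centrality` normalizes in-degree by n - 1, so the ranking is the same as ranking by raw in-degree.

```python
import networkx as nx

# Hypothetical (retweeter, retweeted) pairs standing in for the
# "userid" -> "retweet_userid" columns of the question's DataFrame.
retweets = [
    ("alice", "carol"), ("bob", "carol"), ("dave", "carol"),
    ("carol", "erin"), ("alice", "erin"), ("bob", "dave"),
]

G = nx.DiGraph()
G.add_edges_from(retweets)

# In-degree centrality = in-degree / (n - 1), so the order matches raw in-degree.
centrality = nx.in_degree_centrality(G)

top_k = 3  # arbitrary cutoff for illustration
ranking = sorted(centrality.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
for user, score in ranking:
    print(f"{user}: in-degree={G.in_degree(user)}, centrality={score:.2f}")
```

Note that a `DiGraph` keeps a single edge per ordered pair, so in-degree here counts distinct retweeters and not the total number of retweets.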

And the graph itself:

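    import matplotlib.pyplot as plt

    pos = nx.spring_layout(g)
    # with_labels=True draws the node labels, so no separate draw_networkx_labels call is needed
    nx.draw(g, pos=pos, with_labels=True)
    plt.show()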
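For a graph as large as the one in the question, one possible way to get a readable picture is to draw only the subgraph induced by the highest in-degree nodes. The sketch below is an assumption about what might help, not part of the original answer: `top_k_subgraph` is a hypothetical helper, and a small random graph stands in for the real retweet network.

```python
import networkx as nx

def top_k_subgraph(G, k):
    """Return the subgraph induced by the k nodes with the highest in-degree."""
    top_nodes = sorted(G.nodes(), key=lambda n: G.in_degree(n), reverse=True)[:k]
    return G.subgraph(top_nodes).copy()

# Small random directed graph standing in for the real retweet network.
G = nx.gnp_random_graph(200, 0.02, seed=42, directed=True)

H = top_k_subgraph(G, k=20)
print(H.number_of_nodes(), "nodes,", H.number_of_edges(), "edges kept")

# Node sizes scaled by in-degree in the full graph (the factor 50 is arbitrary).
sizes = [50 * G.in_degree(n) for n in H.nodes()]
pos = nx.spring_layout(H, seed=42)
# nx.draw(H, pos=pos, node_size=sizes, with_labels=True) would then plot it.
```

Only edges between the selected nodes survive, so the picture shows how the most retweeted accounts relate to each other and not their full audiences.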

I hope this answers your question. If not, please leave a comment so I can update the answer.
