这个带有两个连接点的行的图的名称是什么?
机器算法验证
数据可视化
术语
擅长
2022-01-31 00:34:01
3个回答
那是一个点图。它有时被称为“克利夫兰点图”,因为有一种由点组成的直方图的变体,人们有时也称之为点图。这个特定版本为每个国家(两年)绘制了两个点,并在它们之间画了一条更粗的线。这些国家按后一个值排序。主要参考资料是克利夫兰的书Visualizing Data。谷歌搜索将我带到这个 Excel 教程。
我抓取了数据,以防有人想玩它们。
Country 1990 2015
Russia 71.5 101.4
Canada 74.4 102.9
Other non-OECD Europe/Eurasia 60.9 135.2
South Korea 127.0 136.2
China 58.5 137.1
Middle East 170.9 158.8
United States 106.8 169.0
Australia/New Zealand 123.6 170.9
Brazil 208.5 199.8
Japan 181.0 216.7
Africa 185.4 222.0
Other non-OECD Asia 202.7 236.0
OECD Europe 173.8 239.9
Other non-OECD Americas 193.1 242.3
India 173.8 260.6
Mexico/Chile 221.1 269.8
@gung 的答案在识别图表类型和提供如何在 Excel 中实现的链接方面是正确的,正如 OP 所要求的那样。但是对于其他想知道如何在 R/tidyverse/ggplot 中执行此操作的人,下面是完整的代码:
library(dplyr) # for data manipulation
library(tidyr) # for reshaping the data frame
library(stringr) # string manipulation
library(ggplot2) # graphing
# create the data frame
# (in wide format, as needed for the line segments):
dat_wide = tibble::tribble(
~Country, ~Y1990, ~Y2015,
'Russia', 71.5, 101.4,
'Canada', 74.4, 102.9,
'Other non-OECD Europe/Eurasia', 60.9, 135.2,
'South Korea', 127, 136.2,
'China', 58.5, 137.1,
'Middle East', 170.9, 158.8,
'United States', 106.8, 169,
'Australia/New Zealand', 123.6, 170.9,
'Brazil', 208.5, 199.8,
'Japan', 181, 216.7,
'Africa', 185.4, 222,
'Other non-OECD Asia', 202.7, 236,
'OECD Europe', 173.8, 239.9,
'Other non-OECD Americas', 193.1, 242.3,
'India', 173.8, 260.6,
'Mexico/Chile', 221.1, 269.8
)
# a version reshaped to long format (for the points):
dat_long = dat_wide %>%
gather(key = 'Year', value = 'Energy_productivity', Y1990:Y2015) %>%
mutate(Year = str_replace(Year, 'Y', ''))
# create the graph:
ggplot() +
geom_segment(data = dat_wide,
aes(x = Y1990,
xend = Y2015,
y = reorder(Country, Y2015),
yend = reorder(Country, Y2015)),
size = 3, colour = '#D0D0D0') +
geom_point(data = dat_long,
aes(x = Energy_productivity,
y = Country,
colour = Year),
size = 4) +
labs(title = 'Energy productivity in selected countries \nand regions',
subtitle = 'Billion dollars GDP per quadrillion BTU',
caption = 'Source: EIA, 2016',
x = NULL, y = NULL) +
scale_colour_manual(values = c('#1082CD', '#042B41')) +
theme_bw() +
theme(legend.position = c(0.92, 0.20),
legend.title = element_blank(),
legend.box.background = element_rect(colour = 'black'),
panel.border = element_blank(),
axis.ticks = element_line(colour = '#E6E6E6'))
ggsave('energy.png', width = 20, height = 10, units = 'cm')
这可以扩展为添加值标签并突出显示值交换顺序的一种情况的颜色,就像在原始情况中一样。
有人称其为具有两组的(水平)棒棒糖情节。
以下是如何使用matplotlib
和seaborn
(仅用于样式)在 Python 中制作此图,改编自https://python-graph-gallery.com/184-lollipop-plot-with-2-groups/并应要求评论中的OP。
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import io
sns.set(style="whitegrid") # set style
data = io.StringIO(""""Country" 1990 2015
"Russia" 71.5 101.4
"Canada" 74.4 102.9
"Other non-OECD Europe/Eurasia" 60.9 135.2
"South Korea" 127.0 136.2
"China" 58.5 137.1
"Middle East" 170.9 158.8
"United States" 106.8 169.0
"Australia/New Zealand" 123.6 170.9
"Brazil" 208.5 199.8
"Japan" 181.0 216.7
"Africa" 185.4 222.0
"Other non-OECD Asia" 202.7 236.0
"OECD Europe" 173.8 239.9
"Other non-OECD Americas" 193.1 242.3
"India" 173.8 260.6
"Mexico/Chile" 221.1 269.8""")
df = pd.read_csv(data, sep="\s+", quotechar='"')
df = df.set_index("Country").sort_values("2015")
df["change"] = df["2015"] / df["1990"] - 1
plt.figure(figsize=(12,6))
y_range = np.arange(1, len(df.index) + 1)
colors = np.where(df['2015'] > df['1990'], '#d9d9d9', '#d57883')
plt.hlines(y=y_range, xmin=df['1990'], xmax=df['2015'],
color=colors, lw=10)
plt.scatter(df['1990'], y_range, color='#0096d7', s=200, label='1990', zorder=3)
plt.scatter(df['2015'], y_range, color='#003953', s=200 , label='2015', zorder=3)
for (_, row), y in zip(df.iterrows(), y_range):
plt.annotate(f"{row['change']:+.0%}", (max(row["1990"], row["2015"]) + 4, y - 0.25))
plt.legend(ncol=2, bbox_to_anchor=(1., 1.01), loc="lower right", frameon=False)
plt.yticks(y_range, df.index)
plt.title("Energy productivity in selected countries and regions, 1990 and 2015\nBillion dollars GDP per quadrillion BTU", loc='left')
plt.xlim(50, 300)
plt.gcf().subplots_adjust(left=0.35)
plt.tight_layout()
plt.show()