Suggestions on how to visualize this data

data-mining r visualization
2022-03-06 21:42:26

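I'd like to pick people's brains about a better way to visualize data shaped like this:

  • two categorical variables
  • one continuous variable

I'm looking for something more suitable than a heatmap. Does anyone have any suggestions?

Here is the code plus data (note: the real data has more rows than the sample provided here):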

test_data <- structure(list(Toys = c("Slinky", "Slinky", "Slinky", "Slinky", 
                                     "Slinky", "Slinky", "Tin Soldier", "Tin Soldier", "Tin Soldier", 
                                     "Tin Soldier", "Tin Soldier", "Tin Soldier", "Hungry Hungry Hippo", 
                                     "Hungry Hungry Hippo", "Hungry Hungry Hippo", "Hungry Hungry Hippo", 
                                     "Hungry Hungry Hippo", "Hungry Hungry Hippo"), 
                            Manufacturer = c("Manufacturer A", "Manufacturer B", "Manufacturer C", 
                                             "Manufacturer A", "Manufacturer A", "Manufacturer A", 
                                             "Manufacturer B", "Manufacturer B", "Manufacturer B", 
                                             "Manufacturer B", "Manufacturer B", "Manufacturer B", 
                                             "Manufacturer C", "Manufacturer C", "Manufacturer C", 
                                             "Manufacturer C", "Manufacturer C", "Manufacturer C"), 
                            Price = c(5.99, 6.99, 7.99, 9, 6, 5.54, 7, 
                                      9.99, 6.99, 6.75, 8, 7.99, 9.99, 7.99, 5.99, 8.99, 10.99, 9.75)), 
                           class = "data.frame", row.names = c(NA, -18L))

library(dplyr)    # %>% and select()
library(ggplot2)  # ggplot(), geom_tile(), ...
library(plotly)   # ggplotly()
library(scales)   # label_comma()

# Long format: Price ends up in the `value` column
melted_test_data <- reshape::melt(test_data %>% select(Toys, Manufacturer, Price))

plot_test_data <- melted_test_data %>% 
  ggplot(aes(x = Manufacturer, y = Toys, fill = value)) +
  geom_tile() +
  scale_fill_distiller(palette = 'Accent', labels = label_comma()) + 
  theme(panel.background = element_rect(fill = 'white'),
        axis.text.x = element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5)) +
  labs(title = "Price by Manufacturer and Toys", x = "Manufacturers", y = "Toys") +
  guides(fill = guide_colourbar(title = "Price ($)"))
ggplotly(plot_test_data)
1 Answer

If you want some kind of comparison of cost across toys and manufacturers, you can compare the mean or median price of each combination.

library(tidyverse)

# Average price for each toy/manufacturer combination
test_data %>%
  group_by(Toys, Manufacturer) %>%
  summarise(Price = mean(Price), .groups = "drop") %>%
  ggplot() +
  # geom_col() is shorthand for geom_bar(stat = 'identity')
  geom_col(aes(x = Manufacturer, y = Price, fill = Toys), position = 'dodge') +
  scale_fill_brewer(palette = 'Set1') +
  theme_bw() +
  theme(legend.position = 'top')

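The answer mentions the median as well, while the code above plots only the mean. Here is a minimal sketch of the median variant. It rebuilds the question's sample data so that it runs on its own; the names `median_prices` and `p_median` are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Rebuild the sample data from the question (18 rows)
test_data <- data.frame(
  Toys = rep(c("Slinky", "Tin Soldier", "Hungry Hungry Hippo"), each = 6),
  Manufacturer = c("Manufacturer A", "Manufacturer B", "Manufacturer C",
                   rep("Manufacturer A", 3),
                   rep("Manufacturer B", 6),
                   rep("Manufacturer C", 6)),
  Price = c(5.99, 6.99, 7.99, 9, 6, 5.54,
            7, 9.99, 6.99, 6.75, 8, 7.99,
            9.99, 7.99, 5.99, 8.99, 10.99, 9.75)
)

# One median price per toy/manufacturer combination
median_prices <- test_data %>%
  group_by(Toys, Manufacturer) %>%
  summarise(Price = median(Price), .groups = "drop")

# Same dodged bar chart as above, with the median instead of the mean
p_median <- ggplot(median_prices,
                   aes(x = Manufacturer, y = Price, fill = Toys)) +
  geom_col(position = "dodge") +
  scale_fill_brewer(palette = "Set1") +
  theme_bw() +
  theme(legend.position = "top")

print(p_median)
```

The median is less sensitive to a single unusually high or low price, which may matter when a combination has only a handful of rows.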

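Side note:

What does the plot tell us?

Hungry Hungry Hippo and Tin Soldier are each made by only one manufacturer (C and B respectively), whereas Slinky is made by all three. Although all three manufacturers produce the Slinky, its average price differs between them and is highest for Manufacturer C.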
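The reading above can be checked numerically. The sketch below rebuilds the question's sample data, counts the manufacturers per toy, and ranks the mean Slinky price by manufacturer. The names `makers_per_toy` and `slinky_means` are illustrative, not part of the original answer.

```r
library(dplyr)

# Rebuild the sample data from the question (18 rows)
test_data <- data.frame(
  Toys = rep(c("Slinky", "Tin Soldier", "Hungry Hungry Hippo"), each = 6),
  Manufacturer = c("Manufacturer A", "Manufacturer B", "Manufacturer C",
                   rep("Manufacturer A", 3),
                   rep("Manufacturer B", 6),
                   rep("Manufacturer C", 6)),
  Price = c(5.99, 6.99, 7.99, 9, 6, 5.54,
            7, 9.99, 6.99, 6.75, 8, 7.99,
            9.99, 7.99, 5.99, 8.99, 10.99, 9.75)
)

# How many distinct manufacturers make each toy?
makers_per_toy <- test_data %>%
  group_by(Toys) %>%
  summarise(n_manufacturers = n_distinct(Manufacturer), .groups = "drop")

# Mean Slinky price per manufacturer, highest first,
# with the number of rows behind each mean
slinky_means <- test_data %>%
  filter(Toys == "Slinky") %>%
  group_by(Manufacturer) %>%
  summarise(mean_price = mean(Price), n = n(), .groups = "drop") %>%
  arrange(desc(mean_price))

print(makers_per_toy)
print(slinky_means)
```

In this sample the Slinky means for Manufacturers B and C each rest on a single row, so the ranking is only indicative until the full data set is used.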