ggplot (in R) / matplotlib (in Python) with linear models?

data-mining python r matplotlib ggplot2
2021-09-19 03:37:18

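How can I draw the histogram below using ggplot (R) and/or matplotlib (Python)?

So far I have used the following call (the formula interface of `histogram()` comes from the lattice package):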

histogram(~ Wrkday | Year+Avg, data=Data, layout=c(3,2))

[Figure: the plot produced by the call above]

The data I am working with:


The code in R:

Input <- ("
Year       Student  Wrkday
      '1st year'  a        1200
      '1st year'  b        1400
      '1st year'  c        1350
      '1st year'  d         950
      '1st year'  e        1400
      '1st year'  f        1150
      '1st year'  g        1300
      '1st year'  h        1325
      '1st year'  i        1425
      '1st year'  j        1500
      '1st year'  k        1250
      '1st year'  l        1150
      '1st year'  m         950
      '1st year'  n        1150
      '1st year'  o        1600
      '1st year'  p        1300
      '1st year'  q        1050
      '1st year'  r        1300
      '1st year'  s        1700
      '1st year'  t        1300
      '2nd year'  u        1100
      '2nd year'  v        1200
      '2nd year'  w        1250
      '2nd year'  x        1050
      '2nd year'  y        1200
      '2nd year'  z        1250
      '2nd year'  aa       1350
      '2nd year'  ab       1350
      '2nd year'  ac       1325
      '2nd year'  ad       1525
      '2nd year'  ae       1225
      '2nd year'  af       1125
      '2nd year'  ag       1000
      '2nd year'  ah       1125
      '2nd year'  ai       1400
      '2nd year'  aj       1200
      '2nd year'  ak       1150
      '2nd year'  al       1400
      '2nd year'  am       1500
      '2nd year'  an       1200
      '3rd year'  u        1600
      '3rd year'  v        1700
      '3rd year'  w        1450
      '3rd year'  x        1650
      '3rd year'  y        1800
      '3rd year'  z        1550
      '3rd year'  aa       1950
      '3rd year'  ab       1750
      '3rd year'  ac       1925
      '3rd year'  ad       1825
      '3rd year'  ae       1625
      '3rd year'  af       1525
      '3rd year'  ag       1800
      '3rd year'  ah       1725
      '3rd year'  ai       1200
      '3rd year'  aj       1600
      '3rd year'  ak       1950
      '3rd year'  al       1100
      '3rd year'  am       1400
      '3rd year'  an       1600
      ")


Data <- read.table(textConnection(Input),header=TRUE)
los<-rbinom(nrow(Data), size = 1, prob=0.7)
Data$Avg<-ifelse(los==1,"Above 4.0","Below 4.0")

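Instead of a histogram, I want to draw a linear model in each panel.

1 Answer

I am not sure what "with linear models" in the title means, but here is code that generates a toy dataset and reproduces your plot.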

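library(tidyverse)

# One row per panel: 3 years x 2 avg groups = 6 panels
x <- crossing(year = paste("Year", 1:3), avg = c("Above 4.0", "Below 4.0"))
# List-column holding 100 uniform draws per panel
x$wrkday <- replicate(6, runif(100, 1000, 2000), simplify = FALSE)

x %>%
  unnest(wrkday) %>%
  ggplot(aes(wrkday)) +
  # after_stat(ncount) replaces the deprecated ..ncount.. (ggplot2 >= 3.3.0)
  geom_histogram(aes(y = after_stat(ncount)), bins = 20) +
  facet_grid(avg ~ year)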

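The question also asks about matplotlib, so here is a rough Python counterpart of the faceted histogram. It is only a sketch under assumptions: the toy data mirror the uniform draws used in the R code, the seed and the names `data`, `heights`, and `bins` are illustrative, and dividing each panel by its tallest bin is my approximation of ggplot2's `ncount`.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
years = ["Year 1", "Year 2", "Year 3"]
groups = ["Above 4.0", "Below 4.0"]

# Toy data (assumption): 100 uniform draws per (group, year) panel
data = {(g, y): rng.uniform(1000, 2000, size=100) for g in groups for y in years}

bins = np.linspace(1000, 2000, 21)  # 20 equal-width bins shared by all panels
heights = {}

# One row per avg group, one column per year, like facet_grid(avg ~ year)
fig, axes = plt.subplots(len(groups), len(years),
                         sharex=True, sharey=True, figsize=(9, 5))
for i, g in enumerate(groups):
    for j, y in enumerate(years):
        counts, _ = np.histogram(data[(g, y)], bins=bins)
        # Scale so the tallest bar in each panel has height 1
        heights[(g, y)] = counts / counts.max()
        axes[i, j].bar(bins[:-1], heights[(g, y)],
                       width=np.diff(bins), align="edge")
        axes[i, j].set_title(f"{g} | {y}")

for ax in axes[-1, :]:
    ax.set_xlabel("wrkday")
for ax in axes[:, 0]:
    ax.set_ylabel("ncount")
fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result. To use the real data instead, replace `data` with the `Wrkday` values for each `Avg` and `Year` combination.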

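Edit:

Given your comment, I think I now understand what you mean by "with linear models". You want to see how wrkday changes as a function of year, separately for each avg group. Here is some code: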

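library(tidyverse)

# Toy data: the mean of wrkday rises by 500 per year (0, 500, 1000), sd = 300
x <- tibble(year = sample(factor(paste("Year", 1:3)), 600, replace = TRUE),
            avg = sample(c("Above 4.0", "Below 4.0"), 600, replace = TRUE)) %>%
  mutate(wrkday = rnorm(600, mean = 1000 * (as.integer(year) - 1) / 2, sd = 300))

# Year is converted to an integer so that a straight line can be fitted
ggplot(x, aes(x = as.integer(year), y = wrkday)) +
  geom_jitter(width = 0.2) +
  geom_smooth(method = "lm") +   # least-squares line with confidence band
  facet_wrap(~avg)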

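For the matplotlib side of the question, a per-panel linear fit might look roughly like the following. This is a sketch, not a drop-in equivalent: it assumes the same toy model as the R code, uses `np.polyfit` for the least-squares line, and does not draw the confidence band that `geom_smooth` adds. The seed and the names `fits`, `mask`, and `jitter` are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, only for reproducibility
n = 600
year = rng.integers(1, 4, size=n)  # integers 1, 2 or 3 (upper bound excluded)
avg = rng.choice(["Above 4.0", "Below 4.0"], size=n)
# Toy model (assumption): mean rises by 500 per year, sd = 300
wrkday = rng.normal(loc=1000 * (year - 1) / 2, scale=300)

groups = ["Above 4.0", "Below 4.0"]
fits = {}

# One panel per avg group, like facet_wrap(~avg)
fig, axes = plt.subplots(1, len(groups), sharey=True, figsize=(8, 4))
for ax, g in zip(axes, groups):
    mask = avg == g
    # Degree-1 least-squares fit: wrkday ~ intercept + slope * year
    slope, intercept = np.polyfit(year[mask], wrkday[mask], deg=1)
    fits[g] = (slope, intercept)

    # Horizontal jitter so overlapping points are visible
    jitter = rng.uniform(-0.2, 0.2, size=mask.sum())
    ax.scatter(year[mask] + jitter, wrkday[mask], s=8, alpha=0.5)

    xs = np.array([1, 3])
    ax.plot(xs, intercept + slope * xs, color="C1")
    ax.set_title(g)
    ax.set_xticks([1, 2, 3])
    ax.set_xlabel("year")

axes[0].set_ylabel("wrkday")
fig.tight_layout()
```

With this toy model the fitted slopes in `fits` should land near 500 in both panels. On the real data, `year` would be the year number (1, 2, 3) and `wrkday` the `Wrkday` column.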