Here is a base R solution to your problem:
aggregate(paste(ID , Date) ~ ID + Date, data = df, FUN = length)
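There are more solutions, for example any of the following using dplyr. The first three return one row per (ID, Date) group; add_tally and add_count keep every original row and append the count as a new column: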
library(dplyr)
df %>% group_by(ID, Date) %>% summarise(PurchaseCount = n())
df %>% group_by(ID, Date) %>% tally(name="PurchaseCount")
df %>% group_by(ID, Date) %>% count(name="PurchaseCount")
df %>% group_by(ID, Date) %>% add_tally(name="PurchaseCount")
df %>% group_by(ID, Date) %>% add_count(name="PurchaseCount")
Or using the data.table package:
library(data.table)
setDT(df)[, PurchaseCount:=.N, by = list(ID, Date)]
Or using the sqldf package:
library(sqldf)
sqldf("SELECT ID, Date, COUNT(*) as PurchaseCount
FROM df
GROUP BY Date, ID")
Or plyr:
plyr::count(df, c('ID','Date'))
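Personally, I prefer data.table because := adds the count column directly to the data (by reference, without making a copy) and it is usually fast. aggregate is a good choice when you want to avoid loading additional packages. dplyr generally makes your code more readable because it uses pipes, though that is a matter of personal opinion.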
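The solutions above come in two forms: some collapse the data to one row per (ID, Date) pair, while others keep every row and attach the count as a column. The following is a minimal base R sketch of that difference. The sample data is hypothetical (only the column names ID and Date come from the answer), and the helper column of ones and the use of ave are illustrative choices, not something the answer prescribes.

```r
# Hypothetical sample data; only the column names ID and Date come from the answer.
df <- data.frame(
  ID   = c(1, 1, 1, 2, 2),
  Date = c("2020-01-01", "2020-01-01", "2020-01-02", "2020-01-01", "2020-01-01")
)

# Collapsed form: one row per (ID, Date) pair.
# A helper column of ones is summed within each group.
counts <- aggregate(PurchaseCount ~ ID + Date,
                    data = transform(df, PurchaseCount = 1),
                    FUN = sum)

# In-place form: every original row is kept and gains its group's count.
df$PurchaseCount <- ave(seq_along(df$ID), df$ID, df$Date, FUN = length)
```

With this sample, counts has three rows (one per distinct pair), while df still has five rows, each carrying the size of its own group.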