Find the start and end time of each session id in R

data-mining r
2022-02-21 01:47:59

Suppose I have a data frame:

> data
session id   timestamp                item id
1            2014-04-0618:42:05.822   1
1            2014-04-0618:42:06.800   1
1            2014-04-0618:42:06.820   1
2            2014-04-0315:27:48.118   1
2            2014-04-0315:27:49.440   2
3            2014-04-0315:27:49.550   1
3            2014-04-0315:27:50.240   0
3            2014-04-0315:27:50.540   3
3            2014-04-0315:27:51.530   2

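I want to find the start and end time of each session, along with the number of distinct items in each session. In other words, I want output like this: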

> result
session id   session start and end time                       distinct items in each session
1            2014-04-0618:42:05.822, 2014-04-0618:42:06.820   1
2            2014-04-0315:27:48.118, 2014-04-0315:27:49.440   2
3            2014-04-0315:27:49.550, 2014-04-0315:27:51.530   4

How can I do this?

1 Answer

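Assuming you have access to the dplyr package, you can do the following. This assumes the columns are named session_id, timestamp and item_id, and that timestamp is stored in a type that min() and max() can order (for example POSIXct, or ISO-style character strings rather than a factor).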

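library(dplyr)

# One row per session: earliest and latest timestamp, plus the count of distinct items
data %>%
  group_by(session_id) %>%
  summarise(start_time = min(timestamp),
            end_time = max(timestamp),
            unique_items = n_distinct(item_id))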
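
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It rebuilds the sample data from the question under two assumptions that are not confirmed by the original post: the columns are named `session_id`, `timestamp` and `item_id`, and the timestamps have a space between the date and the time. It also parses the timestamps to POSIXct (in UTC, another assumption) so that `min()` and `max()` compare actual times.

```r
library(dplyr)

# Sample data from the question. The underscore column names and the
# space between date and time are assumptions made for this sketch.
data <- data.frame(
  session_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  timestamp = c(
    "2014-04-06 18:42:05.822", "2014-04-06 18:42:06.800",
    "2014-04-06 18:42:06.820", "2014-04-03 15:27:48.118",
    "2014-04-03 15:27:49.440", "2014-04-03 15:27:49.550",
    "2014-04-03 15:27:50.240", "2014-04-03 15:27:50.540",
    "2014-04-03 15:27:51.530"
  ),
  item_id = c(1, 1, 1, 1, 2, 1, 0, 3, 2),
  stringsAsFactors = FALSE
)

# Parse to POSIXct so min()/max() compare times rather than strings.
# %OS reads seconds including the fractional part; the UTC time zone is assumed.
data$timestamp <- as.POSIXct(data$timestamp,
                             format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# One row per session: first timestamp, last timestamp, distinct item count.
result <- data %>%
  group_by(session_id) %>%
  summarise(
    start_time   = min(timestamp),
    end_time     = max(timestamp),
    unique_items = n_distinct(item_id)
  )

options(digits.secs = 3)  # ask R to show fractional seconds when printing
print(result)
```

The `unique_items` column should come out as 1, 2 and 4 for sessions 1, 2 and 3, matching the counts in the question. If the start and end times are wanted in a single column, as in the desired output, they could be combined afterwards with something like `paste(start_time, end_time, sep = ", ")`.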