Calculating session duration in R

Posted 2025-02-09 16:32:38

I have a dataset with a session id, user id, TimeStamp in UNIX (which I converted using lubridate), and the converted TimeStamp column.

Session  User  ts_UNIX         TimeStamp
123      345   UNIX Timestamp  14-06-2022 17:44:32
123      345   UNIX Timestamp  14-06-2022 17:44:33
123      345   UNIX Timestamp  14-06-2022 17:44:37
124      346   UNIX Timestamp  14-06-2022 15:50:10
124      346   UNIX Timestamp  14-06-2022 15:51:01
124      346   UNIX Timestamp  14-06-2022 16:30:00
125      345   UNIX Timestamp  14-06-2022 23:55:30
125      345   UNIX Timestamp  14-06-2022 23:58:50
125      345   UNIX Timestamp  14-06-2022 23:59:45
125      345   UNIX Timestamp  15-06-2022 00:00:32
125      345   UNIX Timestamp  15-06-2022 00:00:59

I would like to add another column called session_duration (in seconds) which is the difference between the max_time and min_time grouped by Session and User. For instance, for session # 123 and user 345, the session duration is [14-06-2022 17:44:37] - [14-06-2022 17:44:32] which is 5 seconds.

Session  User  ts_UNIX         TimeStamp            session_duration (seconds)
123      345   UNIX Timestamp  14-06-2022 17:44:32  5
123      345   UNIX Timestamp  14-06-2022 17:44:33  5
123      345   UNIX Timestamp  14-06-2022 17:44:37  5
124      346   UNIX Timestamp  14-06-2022 15:50:10  2390
124      346   UNIX Timestamp  14-06-2022 15:51:01  2390
124      346   UNIX Timestamp  14-06-2022 16:30:00  2390
125      345   UNIX Timestamp  14-06-2022 23:55:30  329
125      345   UNIX Timestamp  14-06-2022 23:58:50  329
125      345   UNIX Timestamp  14-06-2022 23:59:45  329
125      345   UNIX Timestamp  15-06-2022 00:00:32  329
125      345   UNIX Timestamp  15-06-2022 00:00:59  329

This is what my current code looks like. The timestamp has successfully converted, but I am facing an issue with the session duration column.

library(tidyverse)
library(lubridate)
df <- df %>%
  mutate(timestamp = as_datetime(ts_unix/1000)) %>%
  group_by (session, user, timestamp) %>%
  mutate(session_duration = difftime (max(timestamp), min(timestamp), units = "secs"))

Can someone please help me figure out the session_duration column? Thank you.
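
A likely cause, sketched under stated assumptions: because `timestamp` is one of the grouping columns in the code above, every group holds a single distinct timestamp, so `max(timestamp)` equals `min(timestamp)` and the duration is always 0. The example below uses a small made-up stand-in for the data (the `ts_unix` values are assumed to be in milliseconds, as the `/1000` in the question suggests) and compares the two groupings.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the asker's data; ts_unix is assumed to be in milliseconds.
times <- as.POSIXct(
  c("2022-06-14 17:44:32", "2022-06-14 17:44:33", "2022-06-14 17:44:37",
    "2022-06-14 15:50:10", "2022-06-14 15:51:01", "2022-06-14 16:30:00"),
  tz = "UTC"
)
df <- tibble(
  session = c(123, 123, 123, 124, 124, 124),
  user    = c(345, 345, 345, 346, 346, 346),
  ts_unix = as.numeric(times) * 1000
)

# Grouping as in the question: timestamp is part of the key,
# so max and min are the same value within every group.
by_timestamp <- df %>%
  mutate(timestamp = as_datetime(ts_unix / 1000)) %>%
  group_by(session, user, timestamp) %>%
  mutate(session_duration = difftime(max(timestamp), min(timestamp), units = "secs")) %>%
  ungroup()

# Grouping by session and user only: each session forms one group.
by_session <- df %>%
  mutate(timestamp = as_datetime(ts_unix / 1000)) %>%
  group_by(session, user) %>%
  mutate(session_duration = difftime(max(timestamp), min(timestamp), units = "secs")) %>%
  ungroup()

as.numeric(by_timestamp$session_duration)  # all zeros
as.numeric(by_session$session_duration)    # 5 for session 123, 2390 for session 124
```

The only difference between the two pipelines is the `group_by()` call; the conversion and the `difftime()` expression are left as posted.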

Comments (2)

一绘本一梦想 2025-02-16 16:32:38
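library(data.table)
# Group by both Session and User, and fix the unit so every group is reported in seconds
setDT(df)[, duration := difftime(max(TimeStamp), min(TimeStamp), units = "secs"), by = .(Session, User)][]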

#    Session User        ts_UNIX           TimeStamp  duration
# 1:     123  345 UNIX Timestamp 2022-06-14 17:44:32    5 secs
# 2:     123  345 UNIX Timestamp 2022-06-14 17:44:33    5 secs
# 3:     123  345 UNIX Timestamp 2022-06-14 17:44:37    5 secs
# 4:     124  346 UNIX Timestamp 2022-06-14 15:50:10 2390 secs
# 5:     124  346 UNIX Timestamp 2022-06-14 15:51:01 2390 secs
# 6:     124  346 UNIX Timestamp 2022-06-14 16:30:00 2390 secs
# 7:     125  345 UNIX Timestamp 2022-06-14 23:55:30  329 secs
# 8:     125  345 UNIX Timestamp 2022-06-14 23:58:50  329 secs
# 9:     125  345 UNIX Timestamp 2022-06-14 23:59:45  329 secs
#10:     125  345 UNIX Timestamp 2022-06-15 00:00:32  329 secs
#11:     125  345 UNIX Timestamp 2022-06-15 00:00:59  329 secs

Sample data

df <- fread("Session    User    ts_UNIX     TimeStamp
123     345     UNIX Timestamp  14-06-2022T17:44:32
123     345     UNIX Timestamp  14-06-2022T17:44:33
123     345     UNIX Timestamp  14-06-2022T17:44:37
124     346     UNIX Timestamp  14-06-2022T15:50:10
124     346     UNIX Timestamp  14-06-2022T15:51:01
124     346     UNIX Timestamp  14-06-2022T16:30:00
125     345     UNIX Timestamp  14-06-2022T23:55:30
125     345     UNIX Timestamp  14-06-2022T23:58:50
125     345     UNIX Timestamp  14-06-2022T23:59:45
125     345     UNIX Timestamp  15-06-2022T00:00:32
125     345     UNIX Timestamp  15-06-2022T00:00:59")

df[, TimeStamp := as.POSIXct(TimeStamp, format= "%d-%m-%YT%H:%M:%S")]
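
A note on units, as a small base R sketch rather than part of the answer above: subtracting two `POSIXct` values lets R choose the unit automatically, so a difference of 2390 seconds comes back in minutes. Passing `units = "secs"` to `difftime()` fixes the unit, which avoids relying on how a grouped operation combines results with different units. The two timestamps are the first and last rows of session 124 in the sample data; the UTC time zone is an assumption made for the example.

```r
# First and last timestamps of session 124 (time zone assumed to be UTC).
start <- as.POSIXct("14-06-2022T15:50:10", format = "%d-%m-%YT%H:%M:%S", tz = "UTC")
end   <- as.POSIXct("14-06-2022T16:30:00", format = "%d-%m-%YT%H:%M:%S", tz = "UTC")

auto_units <- end - start                           # unit chosen automatically by R
in_seconds <- difftime(end, start, units = "secs")  # unit fixed to seconds

units(auto_units)       # "mins"
as.numeric(in_seconds)  # 2390
```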
把昨日还给我 2025-02-16 16:32:38
library(tidyverse)
library(lubridate)

df %>% 
  group_by(Session, User) %>% 
  mutate(session_duration = max(TimeStamp) - min(TimeStamp))

# A tibble: 11 × 5
# Groups:   Session, User [3]
   Session  User ts_UNIX        TimeStamp           session_duration
     <dbl> <dbl> <chr>          <dttm>              <drtn>          
 1     123   345 UNIX Timestamp 2022-06-14 17:44:32    5 secs       
 2     123   345 UNIX Timestamp 2022-06-14 17:44:33    5 secs       
 3     123   345 UNIX Timestamp 2022-06-14 17:44:37    5 secs       
 4     124   346 UNIX Timestamp 2022-06-14 15:50:10 2390 secs       
 5     124   346 UNIX Timestamp 2022-06-14 15:51:01 2390 secs       
 6     124   346 UNIX Timestamp 2022-06-14 16:30:00 2390 secs       
 7     125   345 UNIX Timestamp 2022-06-14 23:55:30  329 secs       
 8     125   345 UNIX Timestamp 2022-06-14 23:58:50  329 secs       
 9     125   345 UNIX Timestamp 2022-06-14 23:59:45  329 secs       
10     125   345 UNIX Timestamp 2022-06-15 00:00:32  329 secs       
11     125   345 UNIX Timestamp 2022-06-15 00:00:59  329 secs       
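
If a plain numeric column of seconds is preferred over a `difftime` column, one possible variant (a sketch, not the answerer's code) wraps an explicit `difftime(..., units = "secs")` in `as.numeric()` and drops the grouping afterwards with `ungroup()`. The data frame below is a hypothetical reconstruction of session 125 from the question, with the time zone assumed to be UTC.

```r
library(dplyr)

# Hypothetical reconstruction of session 125 (time zone assumed to be UTC).
df <- tibble(
  Session   = 125,
  User      = 345,
  TimeStamp = as.POSIXct(
    c("2022-06-14 23:55:30", "2022-06-14 23:58:50", "2022-06-14 23:59:45",
      "2022-06-15 00:00:32", "2022-06-15 00:00:59"),
    tz = "UTC"
  )
)

result <- df %>%
  group_by(Session, User) %>%
  # Explicit units, then converted to a bare number of seconds
  mutate(session_duration = as.numeric(
    difftime(max(TimeStamp), min(TimeStamp), units = "secs")
  )) %>%
  ungroup()  # later verbs then operate on the whole table again

result$session_duration  # 329 on every row
```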