Assigning a variable based on time

Posted on 2025-01-14 10:21:44

I have a time series and need to add a new variable that follows a specific rule.

Here is some sample data:

dput(df)
structure(list(Time = structure(c(1567423339.229, 1567423399.018, 
1567424218.867, 1567425478.666, 1567425498.883, 1567426519.008, 
1567429378.848, 1567429398.979, 1567429978.723, 1567431218.909
), tzone = "", class = c("POSIXct", "POSIXt")), RaceNum = c("1", 
"1", "1", "1", "1", "1", "2", "2", "2", "2")), class = "data.frame", row.names = c(NA, 
-10L))

Using case_when or ifelse, I have tried unsuccessfully to assign d on a 1:nrow basis, with one exception: when an event falls within 1 minute of the previous one, it should take the previous value of d and append a b to it. As you can see, the numbering restarts when RaceNum changes. I was splitting the df by RaceNum and then using cbind to put the pieces back together once I had established d.

Here is the expected result

dput(df2)
structure(list(Time = structure(c(1567423339.229, 1567423399.018, 
1567424218.867, 1567425478.666, 1567425498.883, 1567426519.008, 
1567429378.848, 1567429398.979, 1567429978.723, 1567431218.909
), tzone = "", class = c("POSIXct", "POSIXt")), RaceNum = c("1", 
"1", "1", "1", "1", "1", "2", "2", "2", "2"), d = c("1", "1b", 
"2", "3", "3b", "4", "1", "1b", "2", "3")), class = "data.frame", row.names = c(NA, 
-10L))
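
To make the rule concrete, it can help to look at the gap between each row and the previous row of the same RaceNum. The following is only a sketch in base R: the column names gap_secs and within_min are made up for illustration, and the time zone is set to UTC as an assumption (the original data has an empty tzone), which does not affect the gaps.

```r
# Rebuild the sample data from the dput() output above.
# tz = "UTC" is an assumption; the original tzone is "".
df <- data.frame(
  Time = as.POSIXct(c(1567423339.229, 1567423399.018, 1567424218.867,
                      1567425478.666, 1567425498.883, 1567426519.008,
                      1567429378.848, 1567429398.979, 1567429978.723,
                      1567431218.909), origin = "1970-01-01", tz = "UTC"),
  RaceNum = c("1", "1", "1", "1", "1", "1", "2", "2", "2", "2"),
  stringsAsFactors = FALSE
)

# Seconds since the previous row within the same RaceNum
# (NA for the first row of each race). gap_secs is a hypothetical name.
secs <- as.numeric(df$Time)
df$gap_secs <- ave(secs, df$RaceNum, FUN = function(x) c(NA, diff(x)))

# TRUE where a row follows its predecessor by less than one minute.
# within_min is a hypothetical name.
df$within_min <- !is.na(df$gap_secs) & df$gap_secs < 60

df[, c("RaceNum", "gap_secs", "within_min")]
```

The rows flagged TRUE are rows 2, 5 and 8, which are exactly the rows that carry a b suffix in the expected result.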

Comments (1)

撑一把青伞 2025-01-21 10:21:44
  • For each RaceNum, create a variable that increments whenever the difference between consecutive records is greater than 1 minute.
  • Within each resulting group (d), paste a letter onto the group number, leaving the first row of the group without a letter.
library(dplyr)

df %>%
  group_by(RaceNum) %>%
  # Start a new group whenever the gap to the previous record exceeds 1 minute
  mutate(d = cumsum(difftime(Time, lag(Time, default = first(Time)),
                    units = 'mins') > 1) + 1) %>%
  group_by(d, .add = TRUE) %>%
  # The first row of a group keeps the bare number; later rows get b, c, ...
  mutate(d = paste0(d, ifelse(row_number() == 1, '', letters[row_number()]))) %>%
  ungroup()

#   Time                RaceNum d    
#   <dttm>              <chr>   <chr>
# 1 2019-09-02 19:22:19 1       1    
# 2 2019-09-02 19:23:19 1       1b   
# 3 2019-09-02 19:36:58 1       2    
# 4 2019-09-02 19:57:58 1       3    
# 5 2019-09-02 19:58:18 1       3b   
# 6 2019-09-02 20:15:19 1       4    
# 7 2019-09-02 21:02:58 2       1    
# 8 2019-09-02 21:03:18 2       1b   
# 9 2019-09-02 21:12:58 2       2    
#10 2019-09-02 21:33:38 2       3    
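
If dplyr is not available, the same two steps can be sketched in base R with ave(). This is an alternative under stated assumptions, not part of the answer above: the helper names secs, grp and pos are made up for illustration, the time zone is assumed to be UTC, and like the dplyr version it only has letters for up to 26 rows per group.

```r
# Rebuild the sample data from the question.
# tz = "UTC" is an assumption; the original tzone is "".
df <- data.frame(
  Time = as.POSIXct(c(1567423339.229, 1567423399.018, 1567424218.867,
                      1567425478.666, 1567425498.883, 1567426519.008,
                      1567429378.848, 1567429398.979, 1567429978.723,
                      1567431218.909), origin = "1970-01-01", tz = "UTC"),
  RaceNum = c("1", "1", "1", "1", "1", "1", "2", "2", "2", "2"),
  stringsAsFactors = FALSE
)

secs <- as.numeric(df$Time)

# Step 1: group counter per race. It starts at 1 and increments
# whenever the gap to the previous row exceeds 60 seconds.
grp <- ave(secs, df$RaceNum,
           FUN = function(x) cumsum(c(TRUE, diff(x) > 60)))

# Step 2: position of each row within its (RaceNum, grp) group.
pos <- ave(secs, df$RaceNum, grp, FUN = seq_along)

# The first row keeps the bare number; later rows get b, c, ...
df$d <- paste0(grp, ifelse(pos == 1, "", letters[pos]))

df
```

On the sample data this gives the same d column as the expected result in the question.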