Using R to calculate the time since binary output = 1

Posted on 2025-01-26 04:35:19

I have binary data in a data frame with a time column, and I want to produce a data frame like the one below, with a new column "duration since =1". I was able to find a Python equivalent of this answer, but I am looking for a way to do this in R.

Binary Output   Time (secs)   duration since =1
0               0             0
0               0.000983      0.000983
0               0.001966      0.001966
1               0.002949      0
0               0.003932      0.000983  # (0.003932-0.002949)
0               0.005000      0.002051  # (0.005000-0.002949)

Comments (3)

习ぎ惯性依靠 2025-02-02 04:35:19

We can use cumsum to decide whether to subtract the Time at which Binary_Output == 1. If cumsum(Binary_Output) == 0, every value of Binary_Output up to and including that row is 0, so Time is kept unchanged in those rows. In all later rows, the Time of the row where Binary_Output == 1 is subtracted.

library(dplyr)

df <- read.table(header = TRUE, text = "Binary_Output   Time
0               0
0               0.000983
0               0.001966
1               0.002949
0               0.003932
0               0.005000")

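# Before the first 1 (cumsum == 0), keep Time as is; afterwards subtract the time of the 1.
# Note: Time[Binary_Output == 1] assumes exactly one row has Binary_Output == 1.
df %>% 
  mutate(duration = ifelse(cumsum(Binary_Output) == 0, Time, Time - Time[Binary_Output == 1]))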

#>   Binary_Output     Time duration
#> 1             0 0.000000 0.000000
#> 2             0 0.000983 0.000983
#> 3             0 0.001966 0.001966
#> 4             1 0.002949 0.000000
#> 5             0 0.003932 0.000983
#> 6             0 0.005000 0.002051

Created on 2022-05-05 by the reprex package (v2.0.1)
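
The expression Time[Binary_Output == 1] yields a single value only when the column contains exactly one 1. If the data may contain several 1s, one possible variation is to carry forward the time of the most recent 1 with cummax. The following is only a sketch under two assumptions that are not stated in the question: Time is non-negative and never decreases, and rows before the first 1 should report Time itself. The data frame df2 and its second event are made up for illustration.

```r
library(dplyr)

# Hypothetical data with two events (rows 4 and 7); not from the original question
df2 <- data.frame(
  Binary_Output = c(0, 0, 0, 1, 0, 0, 1, 0),
  Time          = c(0, 1, 2, 3, 4, 5, 6, 7)
)

res <- df2 %>%
  mutate(
    # Time of the most recent 1 seen so far; stays 0 until the first 1 appears
    last_one = cummax(ifelse(Binary_Output == 1, Time, 0)),
    # Elapsed time since that most recent 1
    duration = Time - last_one
  )

print(res$duration)
# 0 1 2 0 1 2 0 1
```

With a single 1 in the data, this gives the same result as the ifelse version above, because last_one is 0 until the 1 is reached.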

冷…雨湿花 2025-02-02 04:35:19

With data.table:

library(data.table)
setDT(df)

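# Keep Time only where Binary_Output == 1, carry that value forward (LOCF), then subtract it from Time
df[, DurationSince1 := Time - nafill(fifelse(Binary_Output == 1, Time, NA_real_), type = "locf")][]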

   Binary_Output     Time DurationSince1
           <int>    <num>          <num>
1:             0 0.000000             NA
2:             0 0.000983             NA
3:             0 0.001966             NA
4:             1 0.002949       0.000000
5:             0 0.003932       0.000983
6:             0 0.005000       0.002051
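
The result above leaves NA in the rows before the first 1, whereas the expected output in the question shows the elapsed time there. One way to reconcile the two, sketched here under the assumption that rows before the first 1 should simply report Time, is to wrap the subtraction in fcoalesce(). The table name dt is used only for this illustration.

```r
library(data.table)

# Same sample data as in the question
dt <- data.table(
  Binary_Output = c(0L, 0L, 0L, 1L, 0L, 0L),
  Time          = c(0, 0.000983, 0.001966, 0.002949, 0.003932, 0.005000)
)

dt[, DurationSince1 := fcoalesce(
  # Time of the most recent 1, carried forward; NA before the first 1
  Time - nafill(fifelse(Binary_Output == 1L, Time, NA_real_), type = "locf"),
  # Fallback for the leading NA rows: report Time itself
  Time
)]

print(dt)
```

Because the last observation is carried forward, each row is measured against the most recent 1, so this form should also cope with several 1s in the column.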

最佳男配角 2025-02-02 04:35:19

Alternatively, this can be solved by grouping on cumsum(Binary_Output), which has the benefit of reproducing the OP's expected result for the first group, i.e., the first 3 rows:

library(data.table)
setDT(df)[, duration_since_1 := Time - first(Time), by = cumsum(Binary_Output)][]
   Binary_Output     Time duration_since_1
1:             0 0.000000         0.000000
2:             0 0.000983         0.000983
3:             0 0.001966         0.001966
4:             1 0.002949         0.000000
5:             0 0.003932         0.000983
6:             0 0.005000         0.002051

Data

library(data.table)
df <- fread("Binary_Output   Time   duration_since_1
0               0             0
0               0.000983      0.000983
0               0.001966      0.001966
1               0.002949      0
0               0.003932      0.000983
0               0.005000      0.002051")
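
The grouping idea in this answer does not depend on data.table. As a rough base R sketch, ave() can apply the same "subtract the first Time of each group" step, with the group id given by cumsum(Binary_Output). The name df_base is introduced only for this illustration.

```r
# Same sample data as in the question, as a plain data frame
df_base <- data.frame(
  Binary_Output = c(0, 0, 0, 1, 0, 0),
  Time          = c(0, 0.000983, 0.001966, 0.002949, 0.003932, 0.005000)
)

# Group id: 0 before the first 1, then increases by one at every 1
grp <- cumsum(df_base$Binary_Output)

# Within each group, subtract the group's first Time from every Time
df_base$duration_since_1 <- ave(df_base$Time, grp, FUN = function(t) t - t[1])

print(df_base)
```

Each new 1 starts a new group, so the duration restarts at 0 at every 1. The rows before the first 1 are measured from the first Time in the data.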