Calculate the proportion of values for every two rows in R
I have this dataset
df <- tibble(id, event, duration)
For each "dive" row, I need to calculate the proportion of time spent at the surface, using the "surface" row that follows it, and insert the result into a new column. All of this should be done separately for each "id".
proportion = surface / (dive + surface)
#Output dataframe
# A tibble: 8 x 4
  id    event   duration proportion
1 A     surface       56 x
2 A     surface       96 x
3 A     surface       14 x
4 A     surface       77 x
5 B     surface       28 x
6 B     surface       63 x
7 B     surface       47 x
8 B     surface       90 x
############################################################
Edit:
In my original data, I have some "dive" events without a following "surface", and the suggested code fails with this error:
Error in `dplyr::mutate()`:
! Problem while computing `proportion = DurationMin[What ==
"Surface"]/sum(DurationMin)`.
✖ `proportion` must be size 2 or 1, not 0.
ℹ The error occurred in group 2803: ptt = "2017111870", grp = 1015.
Within an "id" there can be an odd number of rows, where a "dive" event is not followed by a "surface". Whenever such an unpaired event is encountered, I need it to be either ignored or given NA. Is this possible?
Use this data frame as an example:
library(tibble)
# 17 rows, matching the printed data frame below (rows 15 and 16 are consecutive dives)
id <- c("A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B", "B")
event <- c("dive", "surface", "dive", "surface", "dive", "surface", "dive", "surface", "dive", "surface", "dive", "surface", "dive", "surface", "dive", "dive", "surface")
duration <- c(55, 56, 40, 96, 58, 14, 43, 77, 19, 28, 34, 63, 29, 47, 61, 45, 30)
df <- tibble(id, event, duration)
> df
   id    event   duration
 1 A     dive          55
 2 A     surface       56
 3 A     dive          40
 4 A     surface       96
 5 A     dive          58
 6 A     surface       14
 7 A     dive          43
 8 A     surface       77
 9 B     dive          19
10 B     surface       28
11 B     dive          34
12 B     surface       63
13 B     dive          29
14 B     surface       47
15 B     dive          61
16 B     dive          45
17 B     surface       30
>
Comments (1)
We can use gl to create a grouping index for every 2 rows, and then create the column 'proportion' by dividing the 'duration' where the event value is 'surface' (event == 'surface') by the sum of 'duration'.
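The code for this step did not survive in this copy, so here is a minimal sketch of the gl idea rather than the answerer's exact code. It assumes dplyr is installed and that the rows strictly alternate dive, surface within each id. The names `df_paired`, `grp` and `result_paired` are illustrative, and the data is the strictly paired part of the question's example.

```r
library(dplyr)

# Strictly paired subset of the question's data:
# every "dive" is immediately followed by a "surface"
df_paired <- tibble(
  id       = rep(c("A", "B"), times = c(8, 6)),
  event    = rep(c("dive", "surface"), times = 7),
  duration = c(55, 56, 40, 96, 58, 14, 43, 77, 19, 28, 34, 63, 29, 47)
)

result_paired <- df_paired %>%
  group_by(id) %>%
  # gl(n, 2, n) yields 1, 1, 2, 2, ... so each pair of rows shares an index
  mutate(grp = as.integer(gl(n(), 2, n()))) %>%
  group_by(id, grp) %>%
  # surface duration divided by the pair total (dive + surface)
  mutate(proportion = duration[event == "surface"] / sum(duration)) %>%
  ungroup() %>%
  # keep one row per pair, as in the desired output
  filter(event == "surface") %>%
  select(-grp)

print(result_paired)
```

This only works when every group of two rows contains exactly one "surface". If a pair holds two dives, `duration[event == "surface"]` has length 0, which is what triggers the "must be size 2 or 1, not 0" error quoted in the question.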
For the new dataset, a different call is needed; its code and output are missing from this copy.
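One way to handle unpaired dives is sketched below. It swaps the fixed two-row gl index for dplyr's `lag()`, so it should not be read as the answerer's original code. A "surface" row gets a proportion only when the row directly before it, within the same id, is a "dive". The helper column `paired` is an illustrative name, and the data is rebuilt from the 17-row example printed in the question.

```r
library(dplyr)

# The 17-row example from the question; rows 15 and 16 are consecutive dives
df <- tibble(
  id       = rep(c("A", "B"), times = c(8, 9)),
  event    = c(rep(c("dive", "surface"), times = 7), "dive", "dive", "surface"),
  duration = c(55, 56, 40, 96, 58, 14, 43, 77, 19, 28, 34, 63, 29, 47, 61, 45, 30)
)

result <- df %>%
  group_by(id) %>%
  mutate(
    # TRUE only for a surface row directly preceded by a dive in the same id
    paired = event == "surface" & lag(event, default = "") == "dive",
    # surface / (dive + surface); NA wherever there is no valid pair
    proportion = if_else(paired, duration / (lag(duration) + duration), NA_real_)
  ) %>%
  ungroup() %>%
  # unpaired dives drop out here; a surface with no preceding dive would keep NA
  filter(event == "surface") %>%
  select(-paired)

print(result)
```

With this data the unpaired dive in row 15 (duration 61) is simply ignored, and the last surface row is paired with the dive of duration 45. To keep the unpaired dive visible with NA instead, the `filter()` step can be left out.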