How to embed an AND operator in a data.table function?
Suppose we start with this dataframe generated by the code immediately beneath:
> data1
   ID Period Values_1 Values_2 State
1   1      1        5        5    X0
2   1      2        0        2    X1
3   1      3        0        0    X2
4   1      4        0       12    X1
5   2      1        1        2    X0
6   2      2       -1        0    X2
7   2      3        0        1    X0
8   2      4        0        0    X0
9   3      1        0        0    X2
10  3      2        0        0    X1
11  3      3        0        0    X9
12  3      4        0        2    X3
13  4      1        1        4    X2
14  4      2        2        5    X1
15  4      3        3        6    X9
16  4      4        0        0    X3
data1 <-
  data.frame(
    ID = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4),
    Period = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4),
    Values_1 = c(5, 0, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 1, 2, 3, 0),
    Values_2 = c(5, 2, 0, 12, 2, 0, 1, 0, 0, 0, 0, 2, 4, 5, 6, 0),
    State = c("X0","X1","X2","X1","X0","X2","X0","X0", "X2","X1","X9","X3", "X2","X1","X9","X3")
  )
I've been using this data.table code to flag an ID with "END" in State1 once it no longer generates values in any of its future periods:
setDT(data1)[, State1 := ifelse(rev(cumsum(rev(Values_1 + Values_2))), State, "END"), ID]
The above code gives these results:
> data1
    ID Period Values_1 Values_2 State State1
 1:  1      1        5        5    X0     X0
 2:  1      2        0        2    X1     X1
 3:  1      3        0        0    X2     X2
 4:  1      4        0       12    X1     X1
 5:  2      1        1        2    X0     X0
 6:  2      2       -1        0    X2    END
 7:  2      3        0        1    X0     X0
 8:  2      4        0        0    X0    END
 9:  3      1        0        0    X2     X2
10:  3      2        0        0    X1     X1
11:  3      3        0        0    X9     X9
12:  3      4        0        2    X3     X3
13:  4      1        1        4    X2     X2
14:  4      2        2        5    X1     X1
15:  4      3        3        6    X9     X9
16:  4      4        0        0    X3    END
However, I would like to get these results instead (note ID = 2, Period 2):
> data1
    ID Period Values_1 Values_2 State State1
 1:  1      1        5        5    X0     X0
 2:  1      2        0        2    X1     X1
 3:  1      3        0        0    X2     X2
 4:  1      4        0       12    X1     X1
 5:  2      1        1        2    X0     X0
 6:  2      2       -1        0    X2     X2
 7:  2      3        0        1    X0     X0
 8:  2      4        0        0    X0    END
 9:  3      1        0        0    X2     X2
10:  3      2        0        0    X1     X1
11:  3      3        0        0    X9     X9
12:  3      4        0        2    X3     X3
13:  4      1        1        4    X2     X2
14:  4      2        2        5    X1     X1
15:  4      3        3        6    X9     X9
16:  4      4        0        0    X3    END
To do this, I need to change the data.table code to something like the below (which doesn't work), so that State1 is flagged "END" only where BOTH Values_1 AND Values_2 (calculated separately) are 0 in the current and all remaining periods for that ID. How can this be done in data.table?
setDT(data1)[, State1 := ifelse(rev(cumsum(rev(Values_1))) & rev(cumsum(rev(Values_2))), State, "END"), ID]
This is related to the post "How to use dplyr or data.table to perform look-ahead calculations by groups of data subsets?"
Comments (2)
Perhaps something like this:
Output:
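A minimal sketch of one way this could be done, not necessarily the commenter's own code. The idea is that "END" should appear only when both reverse cumulative sums are zero, so State is kept when either one is nonzero, which calls for `|` rather than `&`. The helper name `rev_cumsum` is an assumption introduced here for readability, and the sketch assumes the remaining values within a single column do not cancel each other out.

```r
library(data.table)

# Same data as in the question, built directly as a data.table
data1 <- data.table(
  ID       = rep(1:4, each = 4),
  Period   = rep(1:4, times = 4),
  Values_1 = c(5, 0, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 1, 2, 3, 0),
  Values_2 = c(5, 2, 0, 12, 2, 0, 1, 0, 0, 0, 0, 2, 4, 5, 6, 0),
  State    = c("X0", "X1", "X2", "X1", "X0", "X2", "X0", "X0",
               "X2", "X1", "X9", "X3", "X2", "X1", "X9", "X3")
)

# Hypothetical helper: total of a column from the current row to the end of the group
rev_cumsum <- function(x) rev(cumsum(rev(x)))

# Keep State while EITHER column still has a nonzero remaining total;
# flag "END" only when BOTH remaining totals are zero
data1[, State1 := ifelse(rev_cumsum(Values_1) != 0 | rev_cumsum(Values_2) != 0,
                         State, "END"),
      by = ID]

print(data1)
```

On the question's data this leaves ID = 2, Period 2 as X2 and flags only the final period of IDs 2 and 4 as END.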
The answer in the linked post appears to assume non-negative values for Values_1 and Values_2. If there are negatives, insert an abs into the data.table expression:
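A sketch of what that might look like, under the assumption that the expression in question is the reverse cumulative sum from the question. The helper name `rev_cumsum_abs` is hypothetical. Taking `abs` before summing means a positive and a negative value can no longer cancel to zero, so a single combined test is enough.

```r
library(data.table)

# Hypothetical helper: total absolute magnitude from the current row to the end of the group
rev_cumsum_abs <- function(x) rev(cumsum(rev(abs(x))))

# Same data as in the question
data1 <- data.table(
  ID       = rep(1:4, each = 4),
  Period   = rep(1:4, times = 4),
  Values_1 = c(5, 0, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 1, 2, 3, 0),
  Values_2 = c(5, 2, 0, 12, 2, 0, 1, 0, 0, 0, 0, 2, 4, 5, 6, 0),
  State    = c("X0", "X1", "X2", "X1", "X0", "X2", "X0", "X0",
               "X2", "X1", "X9", "X3", "X2", "X1", "X9", "X3")
)

# "END" only when nothing of either sign remains in Values_1 or Values_2
data1[, State1 := ifelse(rev_cumsum_abs(Values_1) + rev_cumsum_abs(Values_2) > 0,
                         State, "END"),
      by = ID]

print(data1)

# Why abs matters: 1 and -1 cancel in a plain reverse cumulative sum
x <- c(1, -1, 0)
rev(cumsum(rev(x)))  # 0 -1 0
rev_cumsum_abs(x)    # 2 1 0
```

With the question's data this gives X0, X2, X0, END for ID = 2, which matches the desired output.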