Creating multiple new columns with mutate() and across() in R
I would like to perform the following calculation on many columns at the same time while they are grouped by ID:
df <- df %>%
  group_by(ID) %>%
  mutate(Flows.2018.04 = Assets.2018.04 -
           (Assets.2018.03 * Returns.2018.04))
The data set entails a column for Assets.YYYY.MM and Returns.YYYY.MM for each month from 2018.04 to 2022.02 and I would like to create a Flows column for each of those.
I know that I could do it like this for every column:
df <- df %>%
  group_by(ID) %>%
  mutate(Flows.2018.04 = Assets.2018.04 -
           (Assets.2018.03 * Returns.2018.04)) %>%
  mutate(Flows.2018.05 = Assets.2018.05 -
           (Assets.2018.04 * Returns.2018.05))
But as I want to do this calculation for 50+ columns, I was hoping there is a more elegant way. To my knowledge, it should be possible with the dplyr across() function, but I was not able to figure out how to do this.
I would like the new columns to be named Flows.YYYY.MM, which complicates the issue further. I thought that the easiest way to achieve this might be to simply rename the columns after creating them.
I have also thought about converting the data frame from wide format to long format to perform this calculation; however, this seemed even more complicated to me.
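For reference, the wide-to-long route mentioned above could look roughly like the sketch below. It assumes that every column is named Assets.YYYY.MM or Returns.YYYY.MM, that each row is a separate observation (ID repeats in the sample data, so a helper key named row is added here purely for illustration), and that sorting the YYYY.MM strings alphabetically gives chronological order.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  ID = c("6F55", "6F55", "ANE3", "ANE3", "6F55"),
  Assets.2018.03 = c(5000, 3000, 5870, 4098, 9878),
  Assets.2018.04 = c(2345, 1926, 8563, 9373, 7432),
  Assets.2018.05 = c(3459, 6933, 1533, 4556, 9855),
  Returns.2018.04 = c(1.03, 0.77, 1.01, 0.97, 1.06),
  Returns.2018.05 = c(0.94, 1.11, 0.89, 1.02, 1.02)
)

# Hypothetical helper key: ID is not unique per row, so number the rows.
keyed <- df %>% mutate(row = row_number())

# One line per (row, month) with separate Assets and Returns columns.
long <- keyed %>%
  pivot_longer(
    cols = matches("^(Assets|Returns)\\."),
    names_to = c(".value", "month"),
    names_pattern = "^(Assets|Returns)\\.(\\d{4}\\.\\d{2})$"
  ) %>%
  group_by(row) %>%
  arrange(month, .by_group = TRUE) %>%
  # Flow = this month's assets minus last month's assets times this month's return.
  mutate(Flows = Assets - lag(Assets) * Returns) %>%
  ungroup()

# Back to wide: one Flows.YYYY.MM column per month that has a flow.
flows_wide <- long %>%
  filter(!is.na(Flows)) %>%
  select(row, month, Flows) %>%
  pivot_wider(names_from = month, values_from = Flows, names_prefix = "Flows.")

result <- keyed %>%
  left_join(flows_wide, by = "row") %>%
  select(-row)
```

The first month (2018.03 here) has no Returns column and no previous Assets value, so its flow is NA and is dropped before widening.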
Any suggestions on achieving the desired outcome?
Please find below the sample data, as requested:
library(tidyverse)
df <- data.frame(
  ID = c("6F55", "6F55", "ANE3", "ANE3", "6F55"),
  Assets.2018.03 = c(5000, 3000, 5870, 4098, 9878),
  Assets.2018.04 = c(2345, 1926, 8563, 9373, 7432),
  Assets.2018.05 = c(3459, 6933, 1533, 4556, 9855),
  Returns.2018.04 = c(1.03, 0.77, 1.01, 0.97, 1.06),
  Returns.2018.05 = c(0.94, 1.11, 0.89, 1.02, 1.02))
df
ID Assets.2018.03 Assets.2018.04 Assets.2018.05 Returns.2018.04 Returns.2018.05
1 6F55 5000 2345 3459 1.03 0.94
2 6F55 3000 1926 6933 0.77 1.11
3 ANE3 5870 8563 1533 1.01 0.89
4 ANE3 4098 9373 4556 0.97 1.02
5 6F55 9878 7432 9855 1.06 1.02
The desired outcome is:
ID Assets.2018.03 Assets.2018.04 Assets.2018.05 Returns.2018.04 Returns.2018.05 Flows.2018.04 Flows.2018.05
1 6F55 5000 2345 3459 1.03 0.94 -2805 1255
2 6F55 3000 1926 6933 0.77 1.11 -384 4795
3 ANE3 5870 8563 1533 1.01 0.89 2634 -6088
4 ANE3 4098 9373 4556 0.97 1.02 5398 -5004
5 6F55 9878 7432 9855 1.06 1.02 -3039 2274
How about this:
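A minimal sketch of an across()-based approach, not necessarily the code originally posted here. It assumes the Assets columns appear in chronological order in the data frame, and that each Returns.YYYY.MM column has a matching Assets.YYYY.MM column with an earlier Assets column before it. The names returns_cols and all_assets are helpers introduced for this illustration.

```r
library(dplyr)

df <- data.frame(
  ID = c("6F55", "6F55", "ANE3", "ANE3", "6F55"),
  Assets.2018.03 = c(5000, 3000, 5870, 4098, 9878),
  Assets.2018.04 = c(2345, 1926, 8563, 9373, 7432),
  Assets.2018.05 = c(3459, 6933, 1533, 4556, 9855),
  Returns.2018.04 = c(1.03, 0.77, 1.01, 0.97, 1.06),
  Returns.2018.05 = c(0.94, 1.11, 0.89, 1.02, 1.02)
)

# Each Returns column corresponds to one month for which a flow can be computed.
returns_cols <- grep("^Returns\\.", names(df), value = TRUE)
# All Assets columns, assumed to be in chronological order.
all_assets <- grep("^Assets\\.", names(df), value = TRUE)

result <- df %>%
  mutate(across(
    all_of(returns_cols),
    ~ {
      # Assets column for the same month as the current Returns column.
      cur <- sub("^Returns", "Assets", cur_column())
      # Assets column for the previous month (the one just before it).
      prev <- all_assets[match(cur, all_assets) - 1]
      .data[[cur]] - .data[[prev]] * .x
    },
    # Name each new column Flows.YYYY.MM instead of overwriting Returns.YYYY.MM.
    .names = "{sub('Returns', 'Flows', .col)}"
  ))

result
```

The arithmetic only combines values within a row, so group_by(ID) is left out of the sketch; adding it before mutate() should give the same numbers. The .names argument takes care of the Flows.YYYY.MM naming, so no separate rename step is needed.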
Created on 2022-04-10 by the reprex package (v2.0.1)