Create a summary row for each group in a data frame based on other variables in the group

Posted 2025-02-12 03:36:50

I am fairly new to R and have ended up in the following situation: I want to create a summary row for each group in the data frame, grouped by Year and Model, where the value of each summary row is obtained by subtracting the values of some Variables from the value of another Variable in the same group.

df <- data.frame(Model = c(1,1,1,2,2,2,2,2,2,2,2,2,2),
             Year = c(2020, 2020, 2020, 2020, 2020, 2020, 2020, 2030, 2030, 2030, 2040, 2040, 2040),
             Variable = c("A", "B", "C", "A", "B", "C", "D", "A", "C", "E", "A", "C", "D"),
             value = c(15, 2, 5, 25, 6, 4, 4, 41, 24,1, 15, 3, 2))

I have managed to create a new row for each group, so it already has a Year and a Variable name that I manually specified using:

df <- df %>% group_by(Model, Year) %>% group_modify(~ add_row(., Variable = "New", .before=0))

However, I am struggling to write the expression that calculates the value for these new rows.

What I want to have instead of the NAs: the value of A - B - D in each group.

Would appreciate any help. My first thread here, pardon for any inconvenience.

鹿港小镇 2025-02-19 03:36:50

You could pivot wider and then back to long format; this adds zero-valued rows for any Variable that is missing from a group:

library(dplyr); library(tidyr)
df %>%
  pivot_wider(names_from = Variable, values_from = value, values_fill = 0) %>%
  mutate(new = A - B - D) %>%
  pivot_longer(-c(Model, Year), names_to = "Variable")


# A tibble: 24 × 4
   Model  Year Variable value
   <dbl> <dbl> <chr>    <dbl>
 1     1  2020 A           15   
 2     1  2020 B            2
 3     1  2020 C            5
 4     1  2020 D            0
 5     1  2020 E            0
 6     1  2020 new         13    # 15 - 2 - 0 = 13
 7     2  2020 A           25
 8     2  2020 B            6
 9     2  2020 C            4
10     2  2020 D            4
# … with 14 more rows

EDIT - variation where we leave the missing values and use coalesce(x, 0) to allow subtraction to treat NA's as zeroes. The pivot_wider creates NA's in the missing spots, but we can exclude these in the pivot_longer using values_drop_na = TRUE.
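To see why coalesce() is needed here, a minimal sketch of NA arithmetic may help. The vectors `a` and `b` are made up for illustration (they are not taken from the pivoted data); the NA stands in for a Variable that is absent from a group.

```r
library(dplyr)

a <- c(15, 41)
b <- c(2, NA)   # hypothetical: NA marks a Variable missing from a group

# Plain subtraction propagates the NA
plain <- a - b

# coalesce() swaps the NA for 0 before subtracting
filled <- a - coalesce(b, 0)

print(plain)
print(filled)
```

The pipeline that follows applies the same idea to the B and D columns produced by pivot_wider.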

df %>%
  pivot_wider(names_from = Variable, values_from = value) %>%
  mutate(new = A - coalesce(B,0) - coalesce(D,0)) %>%
  pivot_longer(-c(Model, Year), names_to = "Variable", values_drop_na = TRUE)

# A tibble: 17 × 4
   Model  Year Variable value
   <dbl> <dbl> <chr>    <dbl>
 1     1  2020 A           15
 2     1  2020 B            2
 3     1  2020 C            5
 4     1  2020 new         13
 5     2  2020 A           25
 6     2  2020 B            6
 7     2  2020 C            4
 8     2  2020 D            4
 9     2  2020 new         15
10     2  2030 A           41
11     2  2030 C           24
12     2  2030 E            1
13     2  2030 new         41
14     2  2040 A           15
15     2  2040 C            3
16     2  2040 D            2
17     2  2040 new         13
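
As an alternative that is not part of the answer above, the summary rows could also be computed without pivoting, by summarising each group in long format and binding the result back on. This is only a sketch: the names `summary_rows` and `result` are illustrative, and it relies on sum() of an empty selection being 0, so a missing B or D counts as zero.

```r
library(dplyr)

df <- data.frame(Model = c(1,1,1,2,2,2,2,2,2,2,2,2,2),
                 Year = c(2020, 2020, 2020, 2020, 2020, 2020, 2020, 2030, 2030, 2030, 2040, 2040, 2040),
                 Variable = c("A", "B", "C", "A", "B", "C", "D", "A", "C", "E", "A", "C", "D"),
                 value = c(15, 2, 5, 25, 6, 4, 4, 41, 24, 1, 15, 3, 2))

# One summary row per (Model, Year): A minus B minus D.
# sum() over an empty selection returns 0, so absent Variables contribute nothing.
summary_rows <- df %>%
  group_by(Model, Year) %>%
  summarise(
    value = sum(value[Variable == "A"]) - sum(value[Variable %in% c("B", "D")]),
    .groups = "drop"
  ) %>%
  mutate(Variable = "new")

# Append the summary rows and place each one at the end of its group
result <- bind_rows(df, summary_rows) %>%
  arrange(Model, Year, Variable == "new", Variable)

print(result)
```

This keeps the original rows untouched and should give the same 17 rows as the coalesce() variant, though it assumes each Variable appears at most once per group.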
