How can I sum values from multiple rows into a new column in R?

Posted on 2025-01-29 05:53:25, 2604 characters, 1 view, 0 comments


My dataframe:

df <- structure(list(Observation = c("Apple", "Blueberry", "Cirtus", 
"Dates", "Eggplant"), Topic = 1:5, Gamma = c(0.1, 0.1, 0.2, 0.2, 
0.1)), class = "data.frame", row.names = c(NA, -5L))

  Observation Topic Gamma
1       Apple     1   0.1
2   Blueberry     2   0.1
3      Cirtus     3   0.2
4       Dates     4   0.2
5    Eggplant     5   0.1

How can I tell R to sum the Gamma values of topics 1, 3, and 5, and separately those of topics 2 and 4, and then save the result in a new column? For example:

Observation	Topic	Gamma	new variable
Apple	1	.10	.40
Blueberry	2	.10	.30
Cirtus	3	.20	.40
Dates	4	.20	.30
Eggplant	5	.10	.40

Essentially, I'd like each observation to get a new value that is the sum of the Gamma scores in its group: topics 1, 3, and 5 together, and topics 2 and 4 together.

Update: Clarification:
I am not trying to add even topic numbers or odd topic numbers. Sometimes it will be a mixture of both. See this new table as an example:

Observation	Topic	Gamma	new variable
Apple	1	.10	.10
Blueberry	2	.10	.70
Cirtus	3	.20	.40
Dates	4	.20	.40
Eggplant	5	.10	.70
Fruits	6	.50	.70

In this example, I left topic 1 alone, added topics 2, 5, and 6, and added topics 3 and 4.

Update: Clarification:

Observation	Topic	Gamma	new variable
Apple	1	.10	.10
Apple	2	.10	.70
Apple	3	.20	.40
Apple	4	.20	.40
Apple	5	.10	.70
Apple	6	.50	.70
Blueberry	1	.20	.20
Blueberry	2	.10	.60
Blueberry	3	.30	.80
Blueberry	4	.50	.80
Blueberry	5	.40	.60
Blueberry	6	.10	.60

In this example, each fruit (observation) has its own set of values for each topic, and I summed the same topic groups as listed above (2, 5, and 6; 3 and 4) per fruit.


淡看悲欢离合 2025-02-05 05:53:25


Update II on new request:

library(dplyr)

df %>% 
  group_by(Observation, grp = case_when(Topic %in% 1 ~ 1,
                           Topic %in% c(2,5,6) ~ 2,
                           Topic %in% c(3,4) ~ 3)) %>% 
  mutate(new_variable = sum(Gamma)) %>% 
  ungroup %>% 
  select(-grp)

  Observation Topic Gamma new_variable
   <chr>       <int> <dbl>        <dbl>
 1 Apple           1   0.1          0.1
 2 Apple           2   0.1          0.7
 3 Apple           3   0.2          0.4
 4 Apple           4   0.2          0.4
 5 Apple           5   0.1          0.7
 6 Apple           6   0.5          0.7
 7 Blueberry       1   0.2          0.2
 8 Blueberry       2   0.1          0.6
 9 Blueberry       3   0.3          0.8
10 Blueberry       4   0.5          0.8
11 Blueberry       5   0.4          0.6
12 Blueberry       6   0.1          0.6

Update: on the OP's new request. This solution is fully inspired by PaulS's solution (credit to him):

library(dplyr)

df %>% 
  group_by(grp = case_when(Topic %in% 1 ~ 1,
                           Topic %in% c(2,5,6) ~ 2,
                           Topic %in% c(3,4) ~ 3)) %>% 
  mutate(new_variable = sum(Gamma)) %>% 
  ungroup %>% 
  select(-grp)

  Observation Topic Gamma new_variable
  <chr>       <int> <dbl>        <dbl>
1 Apple           1   0.1          0.1
2 Blueberry       2   0.1          0.7
3 Cirtus          3   0.2          0.4
4 Dates           4   0.2          0.4
5 Eggplant        5   0.1          0.7
6 Fruits          6   0.5          0.7
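
The case_when mapping above could also be kept in a lookup vector, which may be easier to maintain when there are many topics. This is only a sketch under assumptions: `topic_groups` and the group labels "a", "b", "c" are hypothetical names, and the data frame is the 12-row example from the question.

```r
library(dplyr)

# 12-row example from the question
df <- data.frame(
  Observation = rep(c("Apple", "Blueberry"), each = 6),
  Topic = rep(1:6, times = 2),
  Gamma = c(0.1, 0.1, 0.2, 0.2, 0.1, 0.5,
            0.2, 0.1, 0.3, 0.5, 0.4, 0.1)
)

# Hypothetical lookup: names are topic numbers, values are group labels
topic_groups <- c(`1` = "a", `2` = "b", `5` = "b", `6` = "b",
                  `3` = "c", `4` = "c")

result <- df %>%
  # look up each row's group label by its Topic, then group per fruit
  group_by(Observation,
           grp = unname(topic_groups[as.character(Topic)])) %>%
  mutate(new_variable = sum(Gamma)) %>%
  ungroup() %>%
  select(-grp)

print(result)
```

Changing which topics are summed together then only requires editing `topic_groups`, not the pipeline.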

First answer:
We could sum Gamma separately for odd and even rows, using ifelse to pick the matching sum for each row.
In this case, row_number() could be replaced by Topic:

library(dplyr)

df %>% 
  mutate(new_variable = ifelse(row_number() %% 2 == 1, 
                               sum(Gamma[row_number() %% 2 == 1]), # odd 1,3,5
                               sum(Gamma[row_number() %% 2 == 0])) # even 2,4
         )

  Observation Topic Gamma new_variable
1       Apple     1   0.1          0.4
2   Blueberry     2   0.1          0.3
3      Cirtus     3   0.2          0.4
4       Dates     4   0.2          0.3
5    Eggplant     5   0.1          0.4
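
As a minimal sketch of the remark that row_number() could be replaced by Topic: testing the parity of Topic keeps working when the rows are not in topic order, whereas row position would not. The shuffled row order below is made up for illustration.

```r
library(dplyr)

df <- data.frame(
  Observation = c("Apple", "Blueberry", "Cirtus", "Dates", "Eggplant"),
  Topic = 1:5,
  Gamma = c(0.1, 0.1, 0.2, 0.2, 0.1)
)

# Hypothetical reordering so that row position no longer matches Topic
shuffled <- df[c(2, 1, 3, 5, 4), ]

result <- shuffled %>%
  mutate(new_variable = ifelse(Topic %% 2 == 1,
                               sum(Gamma[Topic %% 2 == 1]),  # odd topics 1, 3, 5
                               sum(Gamma[Topic %% 2 == 0]))) # even topics 2, 4

print(result)
```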

data:

df <- structure(list(Observation = c("Apple", "Blueberry", "Cirtus", 
"Dates", "Eggplant"), Topic = 1:5, Gamma = c(0.1, 0.1, 0.2, 0.2, 
0.1)), class = "data.frame", row.names = c(NA, -5L))

Microbenchmark: AndrewGB's base R solution is the fastest.


執念 2025-02-05 05:53:25


This should do it.

dat <- structure(list(Observation = c("Apple", "Blueberry", "Cirtus", 
                                 "Dates", "Eggplant"), 
                 Topic = 1:5, Gamma = c(0.1, 0.1, 0.2, 0.2, 0.1)), 
            row.names = c(NA, 5L), class = "data.frame")
library(tidyverse)
dat %>% 
  mutate(even = as.numeric(Topic %% 2 == 0)) %>% 
  group_by(even) %>% 
  mutate(new_variable = sum(Gamma))
#> # A tibble: 5 × 5
#> # Groups:   even [2]
#>   Observation Topic Gamma  even new_variable
#>   <chr>       <int> <dbl> <dbl>        <dbl>
#> 1 Apple           1   0.1     0          0.4
#> 2 Blueberry       2   0.1     1          0.3
#> 3 Cirtus          3   0.2     0          0.4
#> 4 Dates           4   0.2     1          0.3
#> 5 Eggplant        5   0.1     0          0.4

Created on 2022-05-13 by the reprex package (v2.0.1)


想念有你 2025-02-05 05:53:25


Another possible solution:

library(dplyr)

df %>% 
  group_by(grp = if_else(Topic %in% c(1, 3, 5), 1, 2)) %>% 
  mutate(new_variable = sum(Gamma)) %>% 
  ungroup %>% 
  select(-grp)

#> # A tibble: 5 × 4
#>   Observation Topic Gamma new_variable
#>   <chr>       <int> <dbl>        <dbl>
#> 1 Apple           1   0.1          0.4
#> 2 Blueberry       2   0.1          0.3
#> 3 Cirtus          3   0.2          0.4
#> 4 Dates           4   0.2          0.3
#> 5 Eggplant        5   0.1          0.4


一桥轻雨一伞开 2025-02-05 05:53:25


Update II (but will work with the first update as well)

With base R, we can first create a new grouping column by copying the Topic column as a factor, and then change its levels according to which topics you want to sum together. Next, we can get the sum of the Gamma column by Observation and the new group (columns 1 and 4). Finally, we remove the grp column.

df$grp <- factor(df$Topic)

levels(df$grp) <- list(
  "1" = 1,
  "2" = c(2,5,6),
  "3" = c(3,4)
)

df$new_variable <- ave(df$Gamma, df[,c(1,4)], FUN = sum)

df <- df[,-4]

Output

   Observation Topic Gamma new_variable
1        Apple     1   0.1          0.1
2        Apple     2   0.1          0.7
3        Apple     3   0.2          0.4
4        Apple     4   0.2          0.4
5        Apple     5   0.1          0.7
6        Apple     6   0.5          0.7
7    Blueberry     1   0.2          0.2
8    Blueberry     2   0.1          0.6
9    Blueberry     3   0.3          0.8
10   Blueberry     4   0.5          0.8
11   Blueberry     5   0.4          0.6
12   Blueberry     6   0.1          0.6

Data

df <- structure(list(Observation = c("Apple", "Apple", "Apple", "Apple", 
"Apple", "Apple", "Blueberry", "Blueberry", "Blueberry", "Blueberry", 
"Blueberry", "Blueberry"), Topic = c(1L, 2L, 3L, 4L, 5L, 6L, 
1L, 2L, 3L, 4L, 5L, 6L), Gamma = c(0.1, 0.1, 0.2, 0.2, 0.1, 0.5, 
0.2, 0.1, 0.3, 0.5, 0.4, 0.1)), class = "data.frame", row.names = c(NA, 
-12L))
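
To see what the `levels<-` step does on its own, here is a small sketch on a bare vector. The `topic` vector is hypothetical, and topic 7 is deliberately left out of the mapping to show that unlisted levels end up as NA.

```r
# Hypothetical topic vector; topic 7 is not listed in the mapping below
topic <- c(1L, 2L, 3L, 4L, 5L, 6L, 7L)

grp <- factor(topic)

# Assigning a named list merges the listed old levels into the named new level
levels(grp) <- list(
  "1" = 1,
  "2" = c(2, 5, 6),
  "3" = c(3, 4)
)

grp_chr <- as.character(grp)
print(grp_chr)
# "1" "2" "3" "3" "2" "2" NA
```

If every topic should belong to some group, it may be worth checking `anyNA(grp)` before summing.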

First Answer

With base R, we can use ave to get the sum for each group. Here, I create the group using a logical since we only have 2 groups.

df$new_variable <- ave(df$Gamma, row.names(df) %in% c(1, 3, 5), FUN=sum)

Output

  Observation Topic Gamma new_variable
1       Apple     1   0.1          0.4
2   Blueberry     2   0.1          0.3
3      Cirtus     3   0.2          0.4
4       Dates     4   0.2          0.3
5    Eggplant     5   0.1          0.4

Or we could get the sum for each grouping of rows and assign to a new column by index.

df$new_variable[c(1, 3, 5)] <- sum(df$Gamma[c(1, 3, 5)], na.rm = TRUE)
df$new_variable[c(2, 4)] <- sum(df$Gamma[c(2, 4)], na.rm = TRUE)
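
The index-assignment idea can be extended to any number of groups with a loop. The following is a sketch, assuming the six-row table from the question's first update; `groups` is a hypothetical name, and rows are matched on Topic rather than on row position.

```r
# Six-row example from the question's first update
df <- data.frame(
  Observation = c("Apple", "Blueberry", "Cirtus", "Dates", "Eggplant", "Fruits"),
  Topic = 1:6,
  Gamma = c(0.1, 0.1, 0.2, 0.2, 0.1, 0.5)
)

# Hypothetical list of topic sets to sum together
groups <- list(1, c(2, 5, 6), c(3, 4))

df$new_variable <- NA_real_
for (g in groups) {
  idx <- df$Topic %in% g                               # rows in this topic set
  df$new_variable[idx] <- sum(df$Gamma[idx], na.rm = TRUE)
}

print(df)
```

Rows whose Topic is not in any set keep NA in new_variable.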
