减去行并创建新的行名称

发布于 2025-01-17 08:49:24 字数 2490 浏览 4 评论 0原文

我想在此数据框中从佛罗里达中减去贝县，并创建一个名为“佛罗里达（-贝县）”的新行。

也许 group_modify 和 add_row (dplyr) 是一种可能性？

year <- c(2005,2006,2007,2005,2006,2007,2005,2006,2007,2005,2006,2007)
county <- c("Alachua County","Alachua County","Alachua County","Baker County","Baker County","Baker County","Bay County","Bay County","Bay County","Florida","Florida","Florida")
pop <- c(3,6,8,9,8,4,5,8,10,17,22,22)
gdp <- c(3,6,8,9,8,4,5,8,10,17,22,22)
area <- c(3,6,8,9,8,4,5,8,10,17,22,22)
density<-c(3,6,8,9,8,4,5,8,10,17,22,22)
df <- data.frame(year, county,pop,gdp,area,density, stringsAsFactors = FALSE)

年份	县	人口	GDP	面积	密度
2005	阿拉楚阿县	3	3	3	3
2005	贝克县	9	9	9	9
2005	贝县	5	5	5	5
2005	佛罗里达州	17	17	17	17
2005	佛罗里达州（-贝县）	12	12	12	12
2006	阿拉楚阿县	6	6	6	6
2006	贝克县	8	8	8	8
2006	贝县	8	8	8	8
2006	佛罗里达州	22	22	22	22
2006	佛罗里达州（-贝县）	14	14 14	14
2007	阿拉楚阿县	8	8	8	8
2007	贝克县	4	4	4	4
2007	贝县	10	10	10	10
2007	佛罗里达州	22	22	22	22
2007	佛罗里达州（-贝县）	12	12	12	12

原文

I would like to subtract Bay County from Florida in this data frame and create a new row with the name "Florida (-Bay County)".

Maybe group_modify and add_row (dplyr) would be a possibility?

year <- c(2005,2006,2007,2005,2006,2007,2005,2006,2007,2005,2006,2007)
county <- c("Alachua County","Alachua County","Alachua County","Baker County","Baker County","Baker County","Bay County","Bay County","Bay County","Florida","Florida","Florida")
pop <- c(3,6,8,9,8,4,5,8,10,17,22,22)
gdp <- c(3,6,8,9,8,4,5,8,10,17,22,22)
area <- c(3,6,8,9,8,4,5,8,10,17,22,22)
density<-c(3,6,8,9,8,4,5,8,10,17,22,22)
df <- data.frame(year, county,pop,gdp,area,density, stringsAsFactors = FALSE)

year	county	pop	gdp	area	density
2005	Alachua County	3	3	3	3
2005	Baker County	9	9	9	9
2005	Bay County	5	5	5	5
2005	Florida	17	17	17	17
2005	Florida (-Bay County)	12	12	12	12
2006	Alachua County	6	6	6	6
2006	Baker County	8	8	8	8
2006	Bay County	8	8	8	8
2006	Florida	22	22	22	22
2006	Florida (-Bay County)	14	14	14	14
2007	Alachua County	8	8	8	8
2007	Baker County	4	4	4	4
2007	Bay County	10	10	10	10
2007	Florida	22	22	22	22
2007	Florida (-Bay County)	12	12	12	12

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

离鸿 2025-01-24 08:49:24

如果您想尝试使用group_modify和add_row尝试尝试，则可以考虑这样的东西。在这里，使用add_row，使用MAP sum 将组中的数据添加到组中，但不包括“佛罗里达州”或“ Bay County”。

library(tidyverse)

df %>%
  group_by(year) %>%
  group_modify(
    ~ .x %>%
      add_row(
        county = "Florida (-Bay County)",
        !!! map(.x %>% 
                  filter(!county %in% c("Florida", "Bay County")) %>%
                  select(-county),
                sum)
        )
  )

输出

    year county                  pop   gdp  area density
   <dbl> <chr>                 <dbl> <dbl> <dbl>   <dbl>
 1  2005 Alachua County            3     3     3       3
 2  2005 Baker County              9     9     9       9
 3  2005 Bay County                5     5     5       5
 4  2005 Florida                  17    17    17      17
 5  2005 Florida (-Bay County)    12    12    12      12
 6  2006 Alachua County            6     6     6       6
 7  2006 Baker County              8     8     8       8
 8  2006 Bay County                8     8     8       8
 9  2006 Florida                  22    22    22      22
10  2006 Florida (-Bay County)    14    14    14      14
11  2007 Alachua County            8     8     8       8
12  2007 Baker County              4     4     4       4
13  2007 Bay County               10    10    10      10
14  2007 Florida                  22    22    22      22
15  2007 Florida (-Bay County)    12    12    12      12

If you wanted to try something with group_modify and add_row, you could consider something like this. Here, when using add_row, use map to sum up the data within the group, but not including "Florida" or "Bay County".

library(tidyverse)

df %>%
  group_by(year) %>%
  group_modify(
    ~ .x %>%
      add_row(
        county = "Florida (-Bay County)",
        !!! map(.x %>% 
                  filter(!county %in% c("Florida", "Bay County")) %>%
                  select(-county),
                sum)
        )
  )

Output

    year county                  pop   gdp  area density
   <dbl> <chr>                 <dbl> <dbl> <dbl>   <dbl>
 1  2005 Alachua County            3     3     3       3
 2  2005 Baker County              9     9     9       9
 3  2005 Bay County                5     5     5       5
 4  2005 Florida                  17    17    17      17
 5  2005 Florida (-Bay County)    12    12    12      12
 6  2006 Alachua County            6     6     6       6
 7  2006 Baker County              8     8     8       8
 8  2006 Bay County                8     8     8       8
 9  2006 Florida                  22    22    22      22
10  2006 Florida (-Bay County)    14    14    14      14
11  2007 Alachua County            8     8     8       8
12  2007 Baker County              4     4     4       4
13  2007 Bay County               10    10    10      10
14  2007 Florida                  22    22    22      22
15  2007 Florida (-Bay County)    12    12    12      12

回复收藏 0 原文

┊风居住的梦幻卍 2025-01-24 08:49:24

你可以做：

df %>% 
  filter(county != 'Florida' & county != 'Bay County') %>%
  group_by(year) %>%
  bind_rows(summarise(., county = 'Florida (-Bay County)', 
                      across(where(is.numeric), sum))) %>%
  arrange(year)
#> # A tibble: 9 x 6
#> # Groups:   year [3]
#>    year county                  pop   gdp  area density
#>   <dbl> <chr>                 <dbl> <dbl> <dbl>   <dbl>
#> 1  2005 Alachua County            3     3     3       3
#> 2  2005 Baker County              9     9     9       9
#> 3  2005 Florida (-Bay County)    12    12    12      12
#> 4  2006 Alachua County            6     6     6       6
#> 5  2006 Baker County              8     8     8       8
#> 6  2006 Florida (-Bay County)    14    14    14      14
#> 7  2007 Alachua County            8     8     8       8
#> 8  2007 Baker County              4     4     4       4
#> 9  2007 Florida (-Bay County)    12    12    12      12

You could do:

df %>% 
  filter(county != 'Florida' & county != 'Bay County') %>%
  group_by(year) %>%
  bind_rows(summarise(., county = 'Florida (-Bay County)', 
                      across(where(is.numeric), sum))) %>%
  arrange(year)
#> # A tibble: 9 x 6
#> # Groups:   year [3]
#>    year county                  pop   gdp  area density
#>   <dbl> <chr>                 <dbl> <dbl> <dbl>   <dbl>
#> 1  2005 Alachua County            3     3     3       3
#> 2  2005 Baker County              9     9     9       9
#> 3  2005 Florida (-Bay County)    12    12    12      12
#> 4  2006 Alachua County            6     6     6       6
#> 5  2006 Baker County              8     8     8       8
#> 6  2006 Florida (-Bay County)    14    14    14      14
#> 7  2007 Alachua County            8     8     8       8
#> 8  2007 Baker County              4     4     4       4
#> 9  2007 Florida (-Bay County)    12    12    12      12

回复收藏 0 原文

~没有更多了~