group_by & summarise works for 4 out of 5 data ranges

Posted on 2025-02-06 18:12:46


I am applying the following function to find the most popular party for each constituency:

election <- elec_df %>% dplyr::filter(Election == 2017  & WKR_NR <= 299) #%>%
  dplyr::group_by(WKR_NR, Partei) %>%
  summarise(
    Anteil_Stimmen = Stimmen/Total_Erststimmen, 
    Max_Partei = max(Anteil_Stimmen, na.rm=TRUE)) %>%
    dplyr::filter(Max_Partei == max(Max_Partei, na.rm=TRUE))

The code works fine when filtering for the years 2005, 2009, and 2013, but fails to group and summarize for the year 2017.

Works fine for year 2013

Fails for year 2017

I thus assume that the problem must be related to the Party variable that is exclusive to the year 2017. However, I cannot find the mistake.

The data set can be found here

Any kind of hint is highly appreciated. Thank you :)


青丝拂面 2025-02-13 18:12:46

Update 7/11/22:

If the raw data are downloaded from the link as provided, read into R (e.g. using read.csv, readr::read_csv, or data.table::fread), and assigned to df, then the following tidy pipeline returns 299 rows per Election:

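library(dplyr)

df %>% 
  rename(WKR_NR = Wahlkreis) %>%   # use the column name from the question
  filter(WKR_NR <= 299) %>%        # keep constituencies 1 to 299
  group_by(Election, WKR_NR, Partei) %>%
  # vote share per party; "drop_last" leaves the result grouped by Election and WKR_NR
  summarise(Anteil_Stimmen = Stimmen / Total_Erststimmen, .groups = "drop_last") %>% 
  slice_max(Anteil_Stimmen)        # highest share within each Election and WKR_NR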
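
To see what each step of this pipeline does without downloading the data, here is a minimal sketch on a made-up data frame. The object names toy and winners and all vote counts are hypothetical; only the column names are taken from the pipeline above, and the row with Wahlkreis 300 is assumed to stand in for a non-constituency row.

```r
library(dplyr)

# Hypothetical toy data: two elections, two constituencies, made-up vote counts.
toy <- data.frame(
  Election          = c(2013, 2013, 2013, 2013, 2017, 2017, 2017, 2017, 2017),
  Wahlkreis         = c(1, 1, 2, 2, 1, 1, 2, 2, 300),
  Partei            = c("CDU", "SPD", "CDU", "SPD", "CDU", "SPD", "CDU", "SPD", "CDU"),
  Stimmen           = c(500, 300, 200, 600, 450, 350, 300, 400, 750),
  Total_Erststimmen = c(1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 2000)
)

winners <- toy %>%
  rename(WKR_NR = Wahlkreis) %>%
  filter(WKR_NR <= 299) %>%                 # drops the row with Wahlkreis 300
  group_by(Election, WKR_NR, Partei) %>%
  # one share per party; the result stays grouped by Election and WKR_NR
  summarise(Anteil_Stimmen = Stimmen / Total_Erststimmen, .groups = "drop_last") %>%
  slice_max(Anteil_Stimmen)                 # top share within each remaining group

print(winners)
```

On this toy data the result has one row per constituency and election (four rows in total). Because slice_max() runs on data still grouped by Election and WKR_NR, the maximum is taken within each constituency rather than across the whole table.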

If you want to split by Election (i.e. by year), you can add this to the end of the pipeline:

  ... %>%
  group_by(Election) %>% 
  group_split()

Output:

[[1]]
# A tibble: 299 x 4
   Election WKR_NR Partei Anteil_Stimmen
      <int>  <int> <chr>           <dbl>
 1     2002      1 SPD             0.487
 2     2002      2 CDU             0.443
 3     2002      3 SPD             0.458
 4     2002      4 SPD             0.481
 5     2002      5 SPD             0.537
 6     2002      6 SPD             0.479
 7     2002      7 SPD             0.464
 8     2002      8 SPD             0.467
 9     2002      9 SPD             0.485
10     2002     10 SPD             0.461
# ... with 289 more rows

[[2]]
# A tibble: 299 x 4
   Election WKR_NR Partei Anteil_Stimmen
      <int>  <int> <chr>           <dbl>
 1     2005      1 SPD             0.442
 2     2005      2 CDU             0.479
 3     2005      3 CDU             0.449
 4     2005      4 CDU             0.441
 5     2005      5 SPD             0.507
 6     2005      6 SPD             0.470
 7     2005      7 CDU             0.442
 8     2005      8 CDU             0.439
 9     2005      9 SPD             0.446
10     2005     10 CDU             0.444
# ... with 289 more rows

[[3]]
# A tibble: 299 x 4
   Election WKR_NR Partei Anteil_Stimmen
      <int>  <int> <chr>           <dbl>
 1     2009      1 CDU             0.388
 2     2009      2 CDU             0.432
 3     2009      3 CDU             0.398
 4     2009      4 CDU             0.402
 5     2009      5 SPD             0.383
 6     2009      6 CDU             0.386
 7     2009      7 CDU             0.408
 8     2009      8 CDU             0.398
 9     2009      9 CDU             0.386
10     2009     10 CDU             0.399
# ... with 289 more rows

[[4]]
# A tibble: 299 x 4
   Election WKR_NR Partei Anteil_Stimmen
      <int>  <int> <chr>           <dbl>
 1     2013      1 CDU             0.425
 2     2013      2 CDU             0.498
 3     2013      3 CDU             0.454
 4     2013      4 CDU             0.452
 5     2013      5 SPD             0.430
 6     2013      6 CDU             0.437
 7     2013      7 CDU             0.454
 8     2013      8 CDU             0.454
 9     2013      9 CDU             0.459
10     2013     10 CDU             0.452
# ... with 289 more rows

[[5]]
# A tibble: 299 x 4
   Election WKR_NR Partei Anteil_Stimmen
      <int>  <int> <chr>           <dbl>
 1     2017      1 CDU             0.400
 2     2017      2 CDU             0.451
 3     2017      3 CDU             0.419
 4     2017      4 CDU             0.427
 5     2017      5 CDU             0.383
 6     2017      6 CDU             0.407
 7     2017      7 CDU             0.397
 8     2017      8 CDU             0.411
 9     2017      9 CSU             0.442
10     2017     10 CDU             0.395
# ... with 289 more rows
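
Note that group_split() returns an unnamed list, so the year of each element has to be read from its Election column. If named elements are more convenient, base R's split() is one possible alternative. A minimal sketch, assuming a small made-up stand-in for the pipeline result (the winners object and its values are hypothetical):

```r
library(dplyr)

# Hypothetical stand-in for the pipeline result (made-up values).
winners <- data.frame(
  Election       = c(2013, 2013, 2017, 2017),
  WKR_NR         = c(1, 2, 1, 2),
  Partei         = c("CDU", "SPD", "CDU", "SPD"),
  Anteil_Stimmen = c(0.50, 0.60, 0.45, 0.40)
)

# group_split(): one data frame per election, returned as an unnamed list.
unnamed <- winners %>%
  group_by(Election) %>%
  group_split()

# split(): the same pieces, with list names taken from the Election values.
named <- split(winners, winners$Election)

names(named)            # election years as character names
nrow(named[["2017"]])   # rows belonging to the 2017 election
```

With a named list, a single election can be pulled out by its year, e.g. named[["2017"]], instead of by position.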
