for loop with group_by in dplyr (R)

Posted on 2025-02-08 10:06:27

I have the following data frame:

my_df <- data.frame(Municipality=c('a', 'a', 'a', 'a', 'b', 'b', 'c','c','c','d','d'),
                    state=c('ac', 'ac', 'ac', 'ac', 'pb', 'pb', 'am','am','am','pi','pi'),
                    votes=c(541, 463, 246, 49, 2443, 2287, 1035,3530,9999,666,3809))

I would like to calculate the vote share of each row within its "Municipality", and the difference ("margin of victory") between each share and the highest vote share, by state. I tried the following code:

library(dplyr)

actual_df <- my_df %>%
  group_by(Municipality, state) %>%
  mutate(
    share_vote = votes / sum(votes), # calculate vote shares
    # gap between the largest and second-largest share: one value per group
    margin_victory = max(share_vote) - max(share_vote[share_vote != max(share_vote)])
  ) %>%
  ungroup()

This code calculates share_vote correctly, as expected. However, margin_victory is correct only when there are two Municipalities. Below is the result I would like to have:

desired_df <- data.frame(Municipality=c('a', 'a', 'a', 'a', 'b', 'b', 'c','c','c','d','d'),
                    state=c('ac', 'ac', 'ac', 'ac', 'pb', 'pb', 'am','am','am','pi','pi'),
                    votes=c(541, 463, 246, 49, 2443, 2287, 1035,3530,9999,666,3809),
                    margin_victory= c(0.06004619,-0.06004619,0.2270978, 0.3787529,
                                      0.03298097,-0.03298097,
                                      -0.6154902,-0.44417742,0.44417742,
                                      -0.70234637,0.70234637))

I tried to replace margin_victory in the actual_df code with margin_victory = for (i in share_vote) {max(share_vote) - share_vote}, but without success.

Comments (1)

Anonymous user, 2025-02-15 10:06:27

Are you sure about the signs in your desired result? If not, I would suggest the following:

library(tidyverse)

my_df %>% group_by(Municipality, state) %>%
  mutate(
    share_vote = votes / sum(votes),
    mar = ifelse(votes == max(votes),
                 votes - max(votes[votes != max(votes)]),
                 (votes - max(votes))) / sum(votes)) %>%
  ungroup()
#> # A tibble: 11 × 5
#>    Municipality state votes share_vote     mar
#>    <chr>        <chr> <dbl>      <dbl>   <dbl>
#>  1 a            ac      541     0.416   0.0600
#>  2 a            ac      463     0.356  -0.0600
#>  3 a            ac      246     0.189  -0.227 
#>  4 a            ac       49     0.0377 -0.379 
#>  5 b            pb     2443     0.516   0.0330
#>  6 b            pb     2287     0.484  -0.0330
#>  7 c            am     1035     0.0711 -0.615 
#>  8 c            am     3530     0.242  -0.444 
#>  9 c            am     9999     0.687   0.444 
#> 10 d            pi      666     0.149  -0.702 
#> 11 d            pi     3809     0.851   0.702

Created on 2022-06-17 by the reprex package (v2.0.1)
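As a side note on the loop attempt in the question: in R a for loop evaluates to NULL, so it gives mutate() nothing to store, while max(share_vote) - share_vote is already vectorised within each group and needs no loop. Below is a minimal sketch along the lines of the code above, not the answerer's own code. The helper name signed_margin and the columns gap_to_leader and margin_victory are hypothetical names chosen for illustration. It differs from the code above in two assumed edge cases: a tie for first place gives a margin of 0, and a group with a single row gives NA instead of -Inf.

```r
library(dplyr)

my_df <- data.frame(
  Municipality = c('a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'd'),
  state = c('ac', 'ac', 'ac', 'ac', 'pb', 'pb', 'am', 'am', 'am', 'pi', 'pi'),
  votes = c(541, 463, 246, 49, 2443, 2287, 1035, 3530, 9999, 666, 3809)
)

# A for loop evaluates to NULL, which is why assigning it inside mutate() fails
loop_value <- for (s in c(0.4, 0.6)) max(c(0.4, 0.6)) - s
is.null(loop_value)

# Hypothetical helper: signed margin (in votes) within one group.
# The leader gets its lead over the runner-up; every other row gets its
# (negative) distance to the leader.
signed_margin <- function(v) {
  top <- max(v)
  # Second-largest value by position, so a tie for first place gives 0.
  # Assumption: a single-row group has no rival, so its margin is NA.
  runner_up <- if (length(v) > 1) sort(v, decreasing = TRUE)[2] else NA_real_
  ifelse(v == top, top - runner_up, v - top)
}

result <- my_df %>%
  group_by(Municipality, state) %>%
  mutate(
    share_vote = votes / sum(votes),
    # vectorised version of what the loop was meant to compute
    gap_to_leader = max(share_vote) - share_vote,
    # signed margin expressed as a share of the group's total votes
    margin_victory = signed_margin(votes) / sum(votes)
  ) %>%
  ungroup()

result
```

Keeping the margin logic in a small function makes the tie and single-row behaviour explicit, and it can be checked on a plain vector, for example signed_margin(c(5, 5, 3)), before using it inside mutate().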
