Assigning a random variable based on a probability table/data frame in R

Posted 2025-01-10 22:06:49

I have a probability data frame like below, called ptable:

 unique_id color share 
         1   red  0.3  
         1  blue  0.7  
         2   red  0.4  
         3  blue  0.5

I'd like to randomly assign a color variable, based on the share variable in the probability table, to another data frame, join_table, that looks like the one below.

unique_id count
         1    3  
         2    4

I understand sample() but am stuck on how to assign the probability by the shared unique_id. My latest attempt was

join_table %>% 
group_by(unqiue_id) %>% 
mutate(color= sample(ptable$race[unique_id==ptable$unique_id], 
                     size=n(), 
                     prob=ptable$share[nique_id==ptable$unique_id], 
                     replace=TRUE))

Any help would be great.

翻身的咸鱼 2025-01-17 22:06:49

There were two typos in the code:

group_by(unqiue_id) should be group_by(unique_id)
and

prob=ptable$share[nique_id==ptable$unique_id] should be prob=ptable$share[unique_id==ptable$unique_id].

In addition, ptable has no race column, so ptable$race should be ptable$color.

This should work:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
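# Probability table: one row per (unique_id, color) pair
ptable <- tibble::tribble(
  ~unique_id, ~color, ~share,
  1, "red",  0.3,
  1, "blue", 0.7,
  2, "red",  0.4,
  3, "blue", 0.5)

# Table that should receive a randomly drawn color
join_table <- tibble::tribble(
  ~unique_id, ~count,
  1, 3,
  2, 4)

# Each unique_id appears once in join_table, so n() is 1 and one color is
# drawn per row. The draw is random, so the result can differ between runs.
join_table %>% 
  group_by(unique_id) %>% 
  mutate(color = sample(ptable$color[unique_id == ptable$unique_id], 
                        size = n(), 
                        prob = ptable$share[unique_id == ptable$unique_id], 
                        replace = TRUE))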
#> # A tibble: 2 × 3
#> # Groups:   unique_id [2]
#>   unique_id count color
#>       <dbl> <dbl> <chr>
#> 1         1     3 blue 
#> 2         2     4 red

Created on 2022-03-01 by the reprex package (v2.0.1)
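For comparison, here is a minimal base R sketch of the same per-id lookup, in case it helps to see the sampling step on its own. The helper name draw_colors is hypothetical (it is not from the question or from any package), and the sketch assumes every unique_id in join_table also appears in ptable. It samples row indices with sample.int() and relies on the documented behaviour that the prob weights need not sum to one, which is why the single 0.4 share for id 2 is not a problem. The last line assumes that count is meant to be the number of draws per id, which the question does not actually state.

```r
# Data from the question, rebuilt as plain data frames
ptable <- data.frame(
  unique_id = c(1, 1, 2, 3),
  color     = c("red", "blue", "red", "blue"),
  share     = c(0.3, 0.7, 0.4, 0.5),
  stringsAsFactors = FALSE
)
join_table <- data.frame(unique_id = c(1, 2), count = c(3, 4))

# Hypothetical helper: draw `size` colors for one id, weighted by that id's shares.
# Assumes `id` occurs in ptable; sample.int() fails if there are no matching rows.
draw_colors <- function(id, size = 1) {
  rows <- ptable[ptable$unique_id == id, ]
  idx <- sample.int(nrow(rows), size = size, replace = TRUE, prob = rows$share)
  rows$color[idx]
}

set.seed(42)  # only so that repeated runs of this sketch give the same draws

# One color per row of join_table, as in the dplyr answer
join_table$color <- vapply(join_table$unique_id, draw_colors, character(1))

# Assumption: `count` is the number of draws wanted for each id
draws <- Map(draw_colors, join_table$unique_id, join_table$count)
```

Here draws is a list with one character vector per row of join_table. Since id 2 has only one color in ptable, every draw for it is "red", whatever the seed.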
