How to create a random sample in R that is proportional to a vector?
Assume the following data frame:
df <- data.frame(id = 1:6, value=c(10,20,10,20,30,10))
df
  id value
1  1    10
2  2    20
3  3    10
4  4    20
5  5    30
6  6    10
I want to randomly assign every individual to one of three groups (A,B,C). I want to achieve given proportions of 30% to be in group A, 50% to be in group B, 20% to be in group C. But I want to do this assignment based on the value column. In other words, I want to achieve something like the following:
  id value group
1  1    10     A
2  2    20     A
3  3    10     C
4  4    20     B
5  5    30     B
6  6    10     C
or...
  id value group
1  1    10     A
2  2    20     B
3  3    10     A
4  4    20     C
5  5    30     B
6  6    10     A
Of course, in this example, these are perfect solutions. But in general the random assignment should come as close to the given proportions as possible. So another example would be the following:
df <- data.frame(id = 1:6, value=c(112,56,53,13,80,120))
df
  id value
1  1   112
2  2    56
3  3    53
4  4    13
5  5    80
6  6   120
One possible assignment could be:
  id value group
1  1   112     B
2  2    56     A
3  3    53     C
4  4    13     C
5  5    80     A
6  6   120     B
In this case, the assignment wouldn't be perfect but close to the desired proportions (group A: 31.3%, group B: 53.5%, group C: 15.2%).
Is there any way to achieve this in R? Thanks!
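For reference, the group shares of the second example can be checked with a few lines of R. This is only a sketch of the arithmetic; the name group_totals is chosen here for illustration and is not part of the original question.

```r
# Second example from the question, with the proposed assignment added by hand
df <- data.frame(id = 1:6, value = c(112, 56, 53, 13, 80, 120))
df$group <- c("B", "A", "C", "C", "A", "B")

# Sum of value per group, then each group's share of the grand total in percent
group_totals <- tapply(df$value, df$group, sum)
round(100 * group_totals / sum(df$value), 1)
#    A    B    C
# 31.3 53.5 15.2
```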
Comments (1)
I understand your goal as follows: after group assignment, you want sum(value[group == "A"]) / sum(value) to approximately equal 0.3, and likewise for "B" (0.5) and "C" (0.2). If that's the case, all you have to do is assign groups with those probability weights, without doing anything special to take value into account at all. The sums of value will (on average) shake out as you want as a natural consequence of the randomization. Look:
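The demonstration that followed in the original answer is not reproduced here. The block below is a minimal sketch of what such a check might look like, not the answerer's own code: the helper value_share, the data frame big with its made-up values, the seed, and the replicate count are all assumptions introduced for illustration. Only sample() with its prob argument is the technique the answer describes.

```r
# Sketch only: assign groups with sample(..., prob = ...) and look at the value shares.
set.seed(42)  # arbitrary seed, only for reproducibility

groups <- c("A", "B", "C")
probs  <- c(A = 0.3, B = 0.5, C = 0.2)

# Hypothetical helper: share of the total value that lands in each group
value_share <- function(value, group) {
  sapply(groups, function(g) sum(value[group == g])) / sum(value)
}

# A larger, made-up data set: a single random assignment is already close to the targets
n <- 10000
big <- data.frame(id = seq_len(n), value = sample(10:120, n, replace = TRUE))
big$group <- sample(groups, n, replace = TRUE, prob = probs)
round(value_share(big$value, big$group), 3)

# The six-row example from the question: average the shares over many random assignments
df <- data.frame(id = 1:6, value = c(112, 56, 53, 13, 80, 120))
shares <- replicate(
  5000,
  value_share(df$value, sample(groups, nrow(df), replace = TRUE, prob = probs))
)
round(rowMeans(shares), 3)
```

Note that "on average" matters here: with only six rows, any single draw can land far from 30/50/20 (a group may even end up empty), while the averaged shares and the large-sample shares sit close to the targets.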