How to do stratified random sampling with Python pandas when the sample allocation per group is given
I have a data set with the Group and ID of each farmer. I have to select 6 farmers out of 18 using stratified random sampling, where the sampling percentage for each group is given.
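The group-wise percentages are as follows:
Group  Percentage
M,SC   50%
F,SC   25%
M,ST   20%
F,ST   5%
Data set: 18 farmers across four groups (listed in the `df` output below).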
Now, using sampling, I have to select 6 farmers: 6 x 0.50 = 3 farmers from group "M,SC", 6 x 0.25 = 1.5, rounded to 2 farmers, from group "F,SC", and 6 x 0.20 = 1.2, rounded to 1 farmer, from group "M,ST". (For group "F,ST", 6 x 0.05 = 0.3 rounds to 0.)
Here is what I have so far:
df
Out[41]:
    Group  ID
0    M,SC   1
1    M,SC   2
2    M,SC   3
3    M,SC   4
4    M,SC   5
5    F,SC   6
6    F,SC   7
7    F,SC   8
8    F,SC   9
9    M,ST  10
10   M,ST  11
11   M,ST  12
12   M,ST  13
13   M,ST  14
14   F,ST  15
15   F,ST  16
16   F,ST  17
17   F,ST  18
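import numpy as np

N = 6
# Sample each group in proportion to its size, then shuffle the combined rows.
(df.groupby('Group', group_keys=False)
   .apply(lambda x: x.sample(int(np.rint(N * len(x) / len(df)))))
   .sample(frac=1)
   .reset_index(drop=True))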
Out[43]:
  Group  ID
0  M,ST  14
1  M,SC   3
2  M,ST  10
3  M,SC   2
4  F,ST  15
5  F,SC   7
Now I am stuck on how to apply the given percentages in the sampling: 50% for group M,SC, 25% for F,SC, 20% for M,ST, and 5% for F,ST. The code above instead selects the N=6 sample in proportion to each group's size.
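To make the gap concrete, here is a small sketch of the arithmetic only (no sampling yet). It compares the per-group counts that the size-proportional code produces with the counts implied by the given percentages. The dictionary names `group_sizes` and `target_pct` are made up for illustration; the numbers come from the data set and percentages in the question.

```python
import numpy as np

N = 6

# Rows per group in the question's df (18 farmers in total).
group_sizes = {"M,SC": 5, "F,SC": 4, "M,ST": 5, "F,ST": 4}
# Share of the sample each group should get, as given in the question.
target_pct = {"M,SC": 0.50, "F,SC": 0.25, "M,ST": 0.20, "F,ST": 0.05}

total = sum(group_sizes.values())

# What the existing code does: quota proportional to group size.
proportional = {g: int(np.rint(N * size / total)) for g, size in group_sizes.items()}

# What is wanted: quota from the given percentage.
targeted = {g: int(np.rint(N * pct)) for g, pct in target_pct.items()}

print(proportional)  # {'M,SC': 2, 'F,SC': 1, 'M,ST': 2, 'F,ST': 1}
print(targeted)      # {'M,SC': 3, 'F,SC': 2, 'M,ST': 1, 'F,ST': 0}
```

Note that `np.rint` rounds halves to the nearest even integer, so 1.5 becomes 2 here, but 2.5 would also become 2. With other percentages or another N, the rounded quotas may not add up to N.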
I used the following code to solve the problem:
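The answer's code is not included in this copy of the page. As a stand-in, here is a minimal sketch (not the answerer's original code) of one way to sample with the given percentages: look up each group's share in a dictionary, round it to a row count, and sample that many rows from the group. The function name `stratified_sample`, the `fractions` dictionary, and the `seed` parameter are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the question's data set: 18 farmers in 4 groups.
df = pd.DataFrame({
    "Group": ["M,SC"] * 5 + ["F,SC"] * 4 + ["M,ST"] * 5 + ["F,ST"] * 4,
    "ID": range(1, 19),
})

N = 6
# Share of the sample per group, as given in the question.
fractions = {"M,SC": 0.50, "F,SC": 0.25, "M,ST": 0.20, "F,ST": 0.05}


def stratified_sample(df, fractions, n, seed=None):
    """Draw round(n * fraction) rows from each group, then shuffle the result."""
    rs = np.random.RandomState(seed)  # shared state so each draw differs
    parts = []
    for group, rows in df.groupby("Group"):
        k = int(np.rint(n * fractions[group]))  # rounded quota for this group
        k = min(k, len(rows))                   # cannot draw more rows than exist
        if k > 0:
            parts.append(rows.sample(n=k, random_state=rs))
    combined = pd.concat(parts)
    # Shuffle so the rows are not ordered by group.
    return combined.sample(frac=1, random_state=rs).reset_index(drop=True)


result = stratified_sample(df, fractions, N, seed=0)
print(result)
print(result["Group"].value_counts())
```

With the question's percentages this gives 3 rows from M,SC, 2 from F,SC, 1 from M,ST and none from F,ST. Because each quota is rounded independently, the total is not guaranteed to equal N for other inputs, so it may be worth checking `len(result)` and adjusting the largest group if it is off.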