How to do stratified random sampling with Python pandas when the sampling allocation for each group is given

Posted on 2025-01-31 10:44:50

I have a data set with the Group and ID of each farmer. I need to select 6 of the 18 farmers using stratified random sampling, where the sampling percentage for each group is given.

The group-wise percentages are:

M,SC: 50%
F,SC: 25%
M,ST: 20%
F,ST: 5%

With these percentages I have to select 6 farmers: 6 x 0.50 = 3 farmers from group "M,SC", 6 x 0.25 = 1.5, rounded to 2 farmers, from group "F,SC", and 1 farmer from group "M,ST" (6 x 0.20 = 1.2, rounded to 1). Group "F,ST" gets none, since 6 x 0.05 = 0.3 rounds to 0.

Here is the data set and what I have so far:

df
Out[41]: 
   Group  ID
0   M,SC   1
1   M,SC   2
2   M,SC   3
3   M,SC   4
4   M,SC   5
5   F,SC   6
6   F,SC   7
7   F,SC   8
8   F,SC   9
9   M,ST  10
10  M,ST  11
11  M,ST  12
12  M,ST  13
13  M,ST  14
14  F,ST  15
15  F,ST  16
16  F,ST  17
17  F,ST  18

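import numpy as np

N = 6

# Proportional allocation: each group gets rint(N * group_size / total_size) rows,
# then the combined sample is shuffled and re-indexed.
df.groupby('Group', group_keys=False).apply(lambda x: x.sample(int(np.rint(N*len(x)/len(df))))).sample(frac=1).reset_index(drop=True)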
Out[43]: 
  Group  ID
0  M,ST  14
1  M,SC   3
2  M,ST  10
3  M,SC   2
4  F,ST  15
5  F,SC   7

Now I am stuck on how to apply the given percentages in the sampling (M,SC: 50%, F,SC: 25%, M,ST: 20%, F,ST: 5%). The code above instead allocates the N = 6 samples in proportion to the size of each group.
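To make the gap concrete, here is a small sketch that compares the two allocations. It rebuilds the data set from the listing above; the variable names `sizes`, `proportional`, `percentages` and `required` are only illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Rebuild the data set from the question: 18 farmers in four groups.
df = pd.DataFrame({
    "Group": ["M,SC"] * 5 + ["F,SC"] * 4 + ["M,ST"] * 5 + ["F,ST"] * 4,
    "ID": range(1, 19),
})
N = 6

# Proportional allocation, as computed by the groupby one-liner above:
# each group gets rint(N * group_size / total_size) rows.
sizes = df["Group"].value_counts()
proportional = np.rint(N * sizes / len(df)).astype(int)

# Allocation implied by the given percentages: rint(N * percentage).
percentages = pd.Series({"M,SC": 0.50, "F,SC": 0.25, "M,ST": 0.20, "F,ST": 0.05})
required = np.rint(N * percentages).astype(int)

# Side-by-side view, aligned on the group label.
print(pd.DataFrame({"proportional": proportional, "required": required}))
```

The proportional allocation gives 2, 1, 2, 1 farmers for M,SC, F,SC, M,ST, F,ST, while the given percentages call for 3, 2, 1, 0.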

Comments (1)

眼眸印温柔 2025-02-07 10:44:50

This is the code I used to solve the problem (the selected rows vary from run to run):

import pandas as pd

# Required sampling proportion for each group.
proportions = {'M,SC': 0.5, 'F,SC': 0.25, 'M,ST': 0.2, 'F,ST': 0.05}
df['Proportion'] = df['Group'].map(proportions)

# Farmers to draw per group for a total sample of 6, rounded to the nearest integer.
df['Sample'] = (df['Proportion'] * 6).round(0)
n_per_group = df.groupby('Group')['Sample'].first().astype(int)

# Randomly draw that many IDs within each group; the original row index is preserved.
picked = df.groupby('Group', group_keys=False)['ID'].apply(lambda s: s.sample(n_per_group[s.name]))

# Align on the index (unselected rows become NaN), then keep only the selected rows.
df['Selected Farmers_ID'] = picked
df = df.dropna(subset=['Selected Farmers_ID'])

df
Out[11]: 
   Group  ID  Proportion  Sample  Selected Farmers_ID
1   M,SC   2        0.50     3.0                  2.0
3   M,SC   4        0.50     3.0                  4.0
4   M,SC   5        0.50     3.0                  5.0
5   F,SC   6        0.25     2.0                  6.0
8   F,SC   9        0.25     2.0                  9.0
12  M,ST  13        0.20     1.0                 13.0
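Rounding each group's count independently happens to add up to 6 here, but in general the rounded counts can sum to more or fewer than the target. One possible way around that, sketched below, is a largest-remainder allocation followed by per-group sampling. The helper names `allocate` and `stratified_sample`, their parameters, and the choice of the largest-remainder rule are assumptions made for illustration; they are not part of the answer above or of pandas.

```python
import numpy as np
import pandas as pd


def allocate(fractions, n):
    """Split n into integer counts per group (largest-remainder method)."""
    exact = pd.Series(fractions, dtype=float) * n
    counts = np.floor(exact).astype(int)
    shortfall = n - int(counts.sum())
    if shortfall > 0:
        # Hand the leftover units to the groups with the largest fractional parts.
        top = (exact - counts).sort_values(ascending=False).index[:shortfall]
        counts.loc[top] += 1
    return counts


def stratified_sample(df, group_col, fractions, n, random_state=None):
    """Draw n rows in total, split across groups according to fractions."""
    rng = np.random.RandomState(random_state)
    parts = []
    for group, k in allocate(fractions, n).items():
        members = df[df[group_col] == group]
        k = min(int(k), len(members))  # cannot draw more rows than the group has
        if k > 0:
            parts.append(members.sample(n=k, random_state=rng))
    return pd.concat(parts)


# Data set and percentages from the question.
df = pd.DataFrame({
    "Group": ["M,SC"] * 5 + ["F,SC"] * 4 + ["M,ST"] * 5 + ["F,ST"] * 4,
    "ID": range(1, 19),
})
fractions = {"M,SC": 0.50, "F,SC": 0.25, "M,ST": 0.20, "F,ST": 0.05}

sample = stratified_sample(df, "Group", fractions, n=6, random_state=0)
print(sample)
```

Passing `random_state` makes the draw repeatable. With the percentages above, the sketch selects 3 farmers from M,SC, 2 from F,SC, 1 from M,ST and none from F,ST.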