Pandas: how to group by correctly under certain conditions

Posted on 2025-01-09 04:03:54

I ran into a problem when trying to group by in pandas. My data is the table below, up to and including the "sum" column. My desired output is some kind of group-by that gives me the remaining columns: desired_clientgroup and DesiredGroup_output_sum (and likewise avg/max).
For example, the number "104,23" is the sum over client group 1 (I don't know how to generate this group 1, or even its sum).

df_index  client_items  price  qty    sum  desired_clientgroup  DesiredGroup_output_sum
       1             1   10,9    2   21,8                    1                   104,23
       2             2    8,5    5   42,5                    1
       3             3   5,75    3  17,25                    1
       4             4   2,88    1   2,88                    1
       5             5    9,9    2   19,8                    1
       6             1    2,2    4    8,8                    2                    32,92
       7             2   3,55    3  10,65                    2
       8             3   4,49    3  13,47                    2
       9             1    8,2    2   16,4                    3                    44,79
      10             2   9,19    2  18,38                    3
      11             3   6,67    1   6,67                    3
      12             4   3,34    1   3,34                    3
      13             1  15,99    3  47,97                    4                   162,65
      14             2   19,9    5   99,5                    4
      15             3   7,59    2  15,18                    4

Any thoughts on this?

Comments (1)

望笑 2025-01-16 04:03:54

IIUC, you could use:

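# a new group starts at every row where client_items is 1
mask = df['client_items'].eq(1)
# the running count of group starts gives each row its group number (1, 2, 3, ...)
df['clientgroup'] = mask.cumsum()

# compute the per-group total of 'sum', broadcast to every row with transform,
# and assign it only to the first row of each group (other rows stay NaN)
df.loc[mask, 'output_sum'] = df.groupby('clientgroup')['sum'].transform('sum')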

Output:

    df_index  client_items  price  qty    sum  clientgroup  output_sum
0          1             1  10.90    2  21.80            1      104.23
1          2             2   8.50    5  42.50            1         NaN
2          3             3   5.75    3  17.25            1         NaN
3          4             4   2.88    1   2.88            1         NaN
4          5             5   9.90    2  19.80            1         NaN
5          6             1   2.20    4   8.80            2       32.92
6          7             2   3.55    3  10.65            2         NaN
7          8             3   4.49    3  13.47            2         NaN
8          9             1   8.20    2  16.40            3       44.79
9         10             2   9.19    2  18.38            3         NaN
10        11             3   6.67    1   6.67            3         NaN
11        12             4   3.34    1   3.34            3         NaN
12        13             1  15.99    3  47.97            4      162.65
13        14             2  19.90    5  99.50            4         NaN
14        15             3   7.59    2  15.18            4         NaN
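
Since the question also asks for avg and max, here is a self-contained sketch that rebuilds the sample data and applies the same idea to several statistics. It rests on a few assumptions: the values are typed in with decimal points (the question shows decimal commas, which would need converting to floats first), the column names output_sum, output_mean and output_max are made up for illustration, and `Series.where(mask)` is used in place of the `df.loc[mask, ...]` assignment above to keep the result only on the first row of each group.

```python
import pandas as pd

# Sample data from the question, with decimal commas written as decimal points.
df = pd.DataFrame({
    "df_index": range(1, 16),
    "client_items": [1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3],
    "price": [10.9, 8.5, 5.75, 2.88, 9.9, 2.2, 3.55, 4.49,
              8.2, 9.19, 6.67, 3.34, 15.99, 19.9, 7.59],
    "qty": [2, 5, 3, 1, 2, 4, 3, 3, 2, 2, 1, 1, 3, 5, 2],
    "sum": [21.8, 42.5, 17.25, 2.88, 19.8, 8.8, 10.65, 13.47,
            16.4, 18.38, 6.67, 3.34, 47.97, 99.5, 15.18],
})

# True on every row where client_items restarts at 1, i.e. where a group begins.
mask = df["client_items"].eq(1)

# Cumulative count of group starts: 1 for the first block, 2 for the second, ...
df["clientgroup"] = mask.cumsum()

# transform() returns one value per original row (the statistic of its group);
# where(mask) keeps it on the first row of each group and puts NaN elsewhere.
grouped = df.groupby("clientgroup")["sum"]
for stat in ("sum", "mean", "max"):
    df[f"output_{stat}"] = grouped.transform(stat).where(mask)

print(df)
```

If one row per group is enough, `df.groupby('clientgroup')['sum'].agg(['sum', 'mean', 'max'])` should give the same numbers as a compact summary table instead.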