查找组中的最大值,并返回另一列的值
我的目标是按“分组”列对 df 进行分组,然后找到最大销售额和相应的 SKU。有很多资料解释了如何进行groupby
并查找最大销售额,但是添加这个额外的 SKU 列是我无法找到任何资料来解释我做错了什么的地方。
如下所示,我已成功按“分组”列进行分组,并显示每组的最大销售额,但 SKU 不正确。第 1 组销量最高的 SKU 是“E2-MKEP”,我的结果是“V4-DE5U”。
import pandas as pd
list1 = [
['1', 'V4-DE5U', 956.64],
['1', 'DH-Q9OY', 642.43],
['1', 'E2-MKEP', 1071.6],
['2', 'WL-NOLZ', 389.06],
['2', 'JF-4E3C', 162.69],
['3', 'N9-DABP', 618.96],
['3', 'OO-JBHE', 1451.19],
]
cols = ['Grouping', 'SKU', 'Sales']
df = pd.DataFrame(list1, columns = cols)
df1 = df.groupby(['Grouping']).agg(max)[['Sales', 'SKU']]
print(df1)
结果:
Sales SKU
Grouping
1 1071.60 V4-DE5U
2 389.06 WL-NOLZ
3 1451.19 OO-JBHE
My goal is to group the df
by the 'Grouping' column, and then find the max sales and the corresponding SKU. There a lot of sources that explain how to do the groupby
and find max sales, but adding this additional SKU column is where I am unable to find any sources that explain what I am doing wrong.
As you can see below, I am successfully grouping by the 'Grouping' column, and displaying the max Sales per group, but the SKU is incorrect. The SKU with the highest sales for group 1 is 'E2-MKEP', and my result is 'V4-DE5U'.
import pandas as pd
list1 = [
['1', 'V4-DE5U', 956.64],
['1', 'DH-Q9OY', 642.43],
['1', 'E2-MKEP', 1071.6],
['2', 'WL-NOLZ', 389.06],
['2', 'JF-4E3C', 162.69],
['3', 'N9-DABP', 618.96],
['3', 'OO-JBHE', 1451.19],
]
cols = ['Grouping', 'SKU', 'Sales']
df = pd.DataFrame(list1, columns = cols)
df1 = df.groupby(['Grouping']).agg(max)[['Sales', 'SKU']]
print(df1)
result:
Sales SKU
Grouping
1 1071.60 V4-DE5U
2 389.06 WL-NOLZ
3 1451.19 OO-JBHE
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
您可以转换
max
并创建布尔掩码和过滤器df
:输出:
如果您还需要“分组”列,则可以使用
msk
> 在df
上而不选择列:输出:
You could transform
max
and create a boolean mask and filterdf
:Output:
If you want the "Grouping" column as well, you could use
msk
ondf
without selecting columns:Output:
您可以对“销售额”值进行排序,然后删除“分组”重复项
You can sort the "Sales" values and then drop the "Grouping" duplicates