Applying different aggfunc logic at different levels in a pandas pivot table

Posted on 2025-01-10 06:37:31

I would like to apply different "aggfunc" logic to a pandas pivot table. Let's suppose that I have the df below.

import pandas as pd

df1 = pd.DataFrame({'Country': ['Italy', 'Italy', 'Italy', 'Germany', 'Germany', 'Germany', 'France', 'France'],
                    'City': ['Rome', 'Rome', 'Florence', 'Berlin', 'Munich', 'Koln', 'Paris', 'Paris'],
                    'Numbers': [100, 200, 300, 400, 500, 600, 700, 800]})

I would like to calculate the sum of "Numbers" per City and the mean of "Numbers" per Country, so that each row carries both values. I should get the output below.

I must use pd.pivot, but if you have a better solution, feel free to suggest that as well.

Would you be able to help me out?

Country	City	SUM	MEAN
France	Paris	1500	750
Germany	Berlin	400	500
Germany	Koln	600	500
Germany	Munich	500	500
Italy	Florence	300	200
Italy	Rome	300	200

I have tried the following, but it does not produce this output.

pd.pivot_table(df1, values = 'Numbers', index=['Country', 'City'], aggfunc=[np.sum, np.mean])
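
Why this attempt falls short can be checked directly. The sketch below is only an illustration: the name attempt is made up here, and the string aliases 'sum' and 'mean' stand in for np.sum and np.mean. Both functions are applied to the same (Country, City) groups, so the mean for Rome is the city-level 150 rather than the country-level 200.

```python
import pandas as pd

df1 = pd.DataFrame({'Country': ['Italy', 'Italy', 'Italy', 'Germany', 'Germany', 'Germany', 'France', 'France'],
                    'City': ['Rome', 'Rome', 'Florence', 'Berlin', 'Munich', 'Koln', 'Paris', 'Paris'],
                    'Numbers': [100, 200, 300, 400, 500, 600, 700, 800]})

# Both aggregations share the same index, i.e. the same (Country, City) groups
attempt = pd.pivot_table(df1, values='Numbers',
                         index=['Country', 'City'],
                         aggfunc=['sum', 'mean'])

# Columns form a MultiIndex: ('sum', 'Numbers') and ('mean', 'Numbers')
rome_sum = attempt.loc[('Italy', 'Rome'), ('sum', 'Numbers')]
rome_mean = attempt.loc[('Italy', 'Rome'), ('mean', 'Numbers')]
print(rome_sum, rome_mean)  # the mean is taken over Rome's rows only, not over Italy
```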

岁月静好 2025-01-17 06:37:31

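Use GroupBy.transform. It returns a result aligned with the original rows, so the per-City sum and the per-Country mean can be placed side by side and the repeated (Country, City) rows dropped afterwards.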

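new_df = (
    df1.assign(
        # transform broadcasts each group's aggregate back onto its rows
        SUM=df1.groupby('City', sort=False)['Numbers'].transform('sum'),
        MEAN=df1.groupby('Country', sort=False)['Numbers'].transform('mean'),
    )
    # keep one row per (Country, City), then drop the raw values
    .drop_duplicates(['Country', 'City'])
    .drop('Numbers', axis=1)
)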

   Country      City   SUM   MEAN
0    Italy      Rome   300  200.0
2    Italy  Florence   300  200.0
3  Germany    Berlin   400  500.0
4  Germany    Munich   500  500.0
5  Germany      Koln   600  500.0
6   France     Paris  1500  750.0
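
Since the question asks for a pivot-based approach, one possible alternative is to build two pivot tables at different granularities and merge them on Country. This is a minimal sketch rather than the answer's method; the names city_sum, country_mean and result are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Country': ['Italy', 'Italy', 'Italy', 'Germany', 'Germany', 'Germany', 'France', 'France'],
                    'City': ['Rome', 'Rome', 'Florence', 'Berlin', 'Munich', 'Koln', 'Paris', 'Paris'],
                    'Numbers': [100, 200, 300, 400, 500, 600, 700, 800]})

# Sum of Numbers for each (Country, City) pair
city_sum = (pd.pivot_table(df1, values='Numbers',
                           index=['Country', 'City'], aggfunc='sum')
            .rename(columns={'Numbers': 'SUM'})
            .reset_index())

# Mean of Numbers for each Country
country_mean = (pd.pivot_table(df1, values='Numbers',
                               index='Country', aggfunc='mean')
                .rename(columns={'Numbers': 'MEAN'})
                .reset_index())

# Attach the country-level mean to every city row
result = city_sum.merge(country_mean, on='Country')
print(result)
```

Grouping the sum by both Country and City also keeps same-named cities in different countries apart, which grouping by City alone would not.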
