Is it possible to perform multiple operations in a groupby?
Suppose I have the following DataFrame:
    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        'year': [2015, 2015, 2018, 2018, 2020],
        'total': [100, 200, 50, 150, 400],
        'tax': [10, 20, 5, 15, 40]})

       year  total  tax
    0  2015    100   10
    1  2015    200   20
    2  2018     50    5
    3  2018    150   15
    4  2020    400   40
I want to sum up the total and tax columns by year and obtain the size at the same time.
The following code gives me the sum of the two columns:
    df_total_tax = df.groupby('year', as_index=False)[['total', 'tax']].apply(np.sum)
However, I can't figure out how to also include a column for size at the same time. Must I perform a different groupby, then use .size(), and then append that column to df_total_tax? Or is there an easier way?
The end result would look like this:
       year  total  tax  size
    0  2015    300   30     2
    1  2018    200   20     2
    2  2020    400   40     1

You can specify a separate aggregation function for each column with named aggregation.
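The answer's own code did not survive on this page, so the following is only a minimal sketch of what named aggregation could look like for the DataFrame in the question. It assumes pandas 0.25 or later, where `agg` accepts `new_name=(column, function)` keywords. The output names `total`, `tax`, and `size` are chosen here to match the desired result, and computing `size` from the `tax` column is an arbitrary choice, since any column gives the same row count per group.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'year': [2015, 2015, 2018, 2018, 2020],
    'total': [100, 200, 50, 150, 400],
    'tax': [10, 20, 5, 15, 40]})

# Named aggregation: each keyword is an output column,
# each value is an (input column, aggregation function) pair.
result = df.groupby('year', as_index=False).agg(
    total=('total', 'sum'),
    tax=('tax', 'sum'),
    size=('tax', 'size'),  # number of rows in each group
)

print(result)
```

Note that `'size'` counts every row in a group, including rows with missing values, whereas `'count'` would skip NaN entries in the chosen column.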