Pandas pivot table - adding subtotals to a multi-index table
I have a table of data structured as follows:
Name | Card | Payment ID | Amount |
---|---|---|---|
John Doe | t077 | 7312637 | 54 |
John Doe | t077 | 1323131 | 34 |
Jane Doe | s044 | 1231321 | 13 |
John Doe | j544 | 4634564 | 53 |
The output I want to achieve is a pivot table with a format similar to this:
Name | Number of Transactions | Sum |
---|---|---|
John Doe | 3 | 141 |
--- t077 | 2 | 88 |
--- j544 | 1 | 53 |
Jane Doe | 1 | 13 |
--- s044 | 1 | 13 |
Please keep in mind that:
- Payment ID uniquely identifies the transaction (every line in the table)
- Every Name can have one or multiple transactions with one or multiple cards
I tried using pandas `pivot_table`, but I cannot find a way to structure the data as I want (including subtotals per Name). I can only group by Name and Card using:
pd.pivot_table(df, values='Amount', index=['Name','Card'], aggfunc=(np.sum, len))
Sorry for the poor formatting on the table, my markdown skills are quite limited.
Any help on this?
Comments (2)
Pivot table is a good approach, try:
Output:
Subtotal reference: link.
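The code and output for this answer are not shown here. As a stand-in, the following is a minimal sketch of one way to get per-Name subtotals with `pivot_table`. It is an assumption about the general approach, not the answerer's original code: the idea is to pivot once at the (Name, Card) level and once at the Name level, then stack the two results. The helper name `summarize` and the empty-string Card label used for subtotal rows are illustrative choices.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["John Doe", "John Doe", "Jane Doe", "John Doe"],
    "Card": ["t077", "t077", "s044", "j544"],
    "Payment ID": [7312637, 1323131, 1231321, 4634564],
    "Amount": [54, 34, 13, 53],
})


def summarize(frame, index):
    """Hypothetical helper: count and sum of Amount for the given index."""
    table = pd.pivot_table(
        frame, values="Amount", index=index, aggfunc=["count", "sum"]
    )
    # Flatten the (aggfunc, value) column MultiIndex into two plain names.
    table.columns = ["Number of Transactions", "Sum"]
    return table


detail = summarize(df, ["Name", "Card"])   # one row per (Name, Card)
subtotal = summarize(df, "Name")           # one row per Name

# Give subtotal rows an empty Card label so they fit the two-level index
# (assumption: no real card is labelled with an empty string).
subtotal.index = pd.MultiIndex.from_arrays(
    [subtotal.index, [""] * len(subtotal)], names=["Name", "Card"]
)

# The empty label sorts before any card, so each subtotal leads its group.
result = pd.concat([detail, subtotal]).sort_index()
print(result)
```

Because the result is sorted by index, names come out alphabetically (Jane Doe before John Doe) rather than in the order shown in the question.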
Be sure to create a `pivot_table` with `margins=True` and then use the function below:
Use example:
Note: I added the following into the code on both of the `table =` lines: (source)
I am in the process of modifying this a little bit, as I have a `pd.Categorical` sort that is present in the original `df_pivot` but not in the returned table. Any help would be great!
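The function this answer refers to is not reproduced here. The following is only a sketch of what such a helper might look like, assuming a two-level row index and additive aggregations (count and sum). `add_subtotals` and the "Subtotal" label are hypothetical names, not the answerer's code. The sketch walks the first-level groups in the order they already appear in the pivot table instead of re-sorting them.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["John Doe", "John Doe", "Jane Doe", "John Doe"],
    "Card": ["t077", "t077", "s044", "j544"],
    "Payment ID": [7312637, 1323131, 1231321, 4634564],
    "Amount": [54, 34, 13, 53],
})

# Pivot with margins=True, which appends a grand-total row labelled "All".
df_pivot = pd.pivot_table(
    df, values="Amount", index=["Name", "Card"],
    aggfunc=["count", "sum"], margins=True,
)
df_pivot.columns = ["Number of Transactions", "Sum"]


def add_subtotals(table, margins_name="All", label="Subtotal"):
    """Hypothetical helper: append a subtotal row after each first-level group."""
    first_level = table.index.get_level_values(0)
    pieces = []
    # unique() keeps first-appearance order, so groups are not re-sorted.
    for name in first_level[first_level != margins_name].unique():
        group = table[first_level == name]
        subtotal = group.sum().to_frame().T
        subtotal.index = pd.MultiIndex.from_tuples(
            [(name, label)], names=table.index.names
        )
        pieces.extend([group, subtotal])
    # Put the grand-total row back at the bottom.
    pieces.append(table[first_level == margins_name])
    return pd.concat(pieces)


with_subtotals = add_subtotals(df_pivot)
print(with_subtotals)
```

Summing the rows of a group is only valid for additive aggregations. For something like a mean, the subtotal would have to be recomputed from the original data instead.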