pandas pivot_table does not work with date data (No numeric types to aggregate)
I have the following dataframe:
index  id     code  data    date
0      AZ234  B213  apple   2020-09-01  <- duplicate id, code, data
1      AZ234  B213  apple   2022-02-02  <- duplicate id, code, data
2      AZ234  B213  banana  2020-07-01
3      AZ234  B213  orange  2020-05-11
4      AL612  B309  apple   2020-12-05
5      AL612  B309  banana  2020-07-21
6      AL612  B309  orange  2020-09-21
...
I want to create a pivot table to get the following table:
id     code  apple       banana      orange
AZ234  B213  2020-09-01  2020-07-01  2020-05-11
AL612  B309  2020-12-05  2020-07-21  2020-09-21
...
I have tried to do this using pivot_table (pandas):
pd.pivot_table(df, values='date', index=['id','code'],
columns=['data'])
but I get this error:
DataError: No numeric types to aggregate
I have read this post, but it seems a bit different, as I don't want to change the columns. I also got an error when I tried set_index with code and id ("ValueError: Index contains duplicate entries, cannot reshape").
My goal is to create a pivot table with dates as the values of the table.
Comments (1)
There are duplicates per id, code, data, so it is necessary to add an aggregate function. Which one to use depends on whether the date column holds datetimes or strings.
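The answer's original code did not survive on this page, so the following is only a minimal sketch of the idea, not the answerer's exact code. It assumes the sample data from the question, and it assumes that the earliest date should win for duplicated id, code, data rows (the desired output shows 2020-09-01 for AZ234 / apple), hence aggfunc="min" for datetimes. For strings, aggfunc="first" is used as one possible choice that keeps the first value in row order.

```python
import pandas as pd

# Sample data copied from the question (the "index" column is the default RangeIndex).
df = pd.DataFrame({
    "id":   ["AZ234", "AZ234", "AZ234", "AZ234", "AL612", "AL612", "AL612"],
    "code": ["B213", "B213", "B213", "B213", "B309", "B309", "B309"],
    "data": ["apple", "apple", "banana", "orange", "apple", "banana", "orange"],
    "date": ["2020-09-01", "2022-02-02", "2020-07-01", "2020-05-11",
             "2020-12-05", "2020-07-21", "2020-09-21"],
})

# Case 1: the date column holds datetimes.
# Assumption: keep the earliest date per (id, code, data); use "max" for the latest.
df_dt = df.assign(date=pd.to_datetime(df["date"]))
wide_dt = pd.pivot_table(df_dt, values="date", index=["id", "code"],
                         columns="data", aggfunc="min")

# Case 2: the date column holds strings.
# Assumption: keep the first value in row order for duplicated keys.
wide_str = pd.pivot_table(df, values="date", index=["id", "code"],
                          columns="data", aggfunc="first")

# Optional: turn id and code back into ordinary columns and drop the "data" label.
flat = wide_dt.reset_index().rename_axis(columns=None)
print(flat)
```

The default aggfunc of pivot_table is the mean, which is why a non-numeric date column fails unless another aggregate is passed explicitly. Note that the rows of the result are sorted by id and code, so AL612 appears before AZ234.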