Remove duplicate values from an array of tuples in Python
I have a party who purchases products. Every time the customer purchases a product, a new row is generated with the same party number.
I have grouped the products by party number, and I am now stuck with a column in which each row holds a tuple of products:
Party Nbr | Product |
---|---|
1 | (a, a, a, a, b, c) |
2 | (a, d, a, a) |
3 | (a, a, b, b, b) |
I can't find out how to remove all duplicates from each row of the Product column.
Code for the groupby:
pf = prod.groupby(['Party Nbr'])['Product name'].apply(tuple).reset_index().rename(columns= {'Product name': 'Product'})
pf['Product'] = tuple(set(pf['Product']))
ValueError: Length of values (4663) does not match length of index (32539)
Can someone help me?
Comments (2)
Assuming you are using pandas, I recreated your table as a DataFrame and show how you could do the transformation. Note: as mentioned in the comments, the order of the products is not preserved. If you want to preserve the order, you can use a custom function in place of chaining set and tuple.
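The answer's original code did not survive on this page, so the following is only a minimal sketch of the approach it describes. The DataFrame contents mirror the table in the question, while the column names `Unique (unordered)` and `Unique (ordered)` and the helper `unique_in_order` are hypothetical names chosen for illustration. The key point is that the conversion is applied to each row; calling `tuple(set(pf['Product']))` on the whole column deduplicates the rows themselves, which is why the lengths in the ValueError do not match.

```python
import pandas as pd

# Hypothetical data mirroring the table in the question.
pf = pd.DataFrame({
    "Party Nbr": [1, 2, 3],
    "Product": [
        ("a", "a", "a", "a", "b", "c"),
        ("a", "d", "a", "a"),
        ("a", "a", "b", "b", "b"),
    ],
})

# Row-wise set/tuple conversion: duplicates are dropped,
# but the order of the products within a row is not guaranteed.
pf["Unique (unordered)"] = pf["Product"].apply(lambda items: tuple(set(items)))


def unique_in_order(items):
    """Drop duplicates while keeping first-seen order.

    Relies on dicts preserving insertion order (Python 3.7+).
    """
    return tuple(dict.fromkeys(items))


# Custom function in place of set/tuple when order matters.
pf["Unique (ordered)"] = pf["Product"].apply(unique_in_order)

print(pf)
```

With `unique_in_order`, the first row becomes `('a', 'b', 'c')`, the second `('a', 'd')`, and the third `('a', 'b')`.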
To remove the duplicates from a tuple, you can use the set type, which removes duplicates automatically. You can do this with a single call.
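The call itself is missing from this page; a plausible sketch, assuming a single tuple like the first row of the question's table, could look like this. The variable names are illustrative only.

```python
# Example row taken from the question's table.
products = ("a", "a", "a", "a", "b", "c")

# set() drops the duplicates; tuple() converts the result back to a tuple.
# The resulting order is arbitrary because sets are unordered.
unique_products = tuple(set(products))

# If a deterministic order is needed, sort the set before converting.
unique_sorted = tuple(sorted(set(products)))

print(unique_products)
print(unique_sorted)
```

In a DataFrame, the same call has to be applied to each row's tuple, not to the column as a whole.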