Removing duplicate values from an array of tuples in Python

Posted 2025-01-15 06:55:10, 884 characters, 2 views, 0 comments

I have a party that purchases products. Every time the customer purchases a product, a new row is generated with the same party number.

I have grouped the products by party number, and I am now stuck with a column in which each row holds a tuple of products:

Party Nbr  Product
1          (a, a, a, a, b, c)
2          (a, d, a, a)
3          (a, a, b, b, b)

I can't figure out how to remove all duplicates from each row of the Product column.

Code for the groupby:

pf = prod.groupby(['Party Nbr'])['Product name'].apply(tuple).reset_index().rename(columns= {'Product name': 'Product'})

pf['Product'] = tuple(set(pf['Product']))


ValueError: Length of values (4663) does not match length of index (32539)

Is anyone able to help me?

Comments (2)

呆橘 2025-01-22 06:55:10

Assuming you are using pandas, I recreated your table as a DataFrame to show how you could do the transformation.

In [10]: import pandas as pd

In [11]: df = pd.DataFrame({
              "party": [1, 2, 3],
              "product": [
                  ("a", "a", "a", "a", "b", "c"),
                  ("a", "d", "a", "a"),
                  ("a", "a", "b", "b", "b")]})

In [12]: df
Out[12]: 
   party             product
0      1  (a, a, a, a, b, c)
1      2        (a, d, a, a)
2      3     (a, a, b, b, b)

In [13]: df["product"] = df["product"].apply(set).apply(tuple)

In [14]: df
Out[14]: 
   party    product
0      1  (c, b, a)
1      2     (a, d)
2      3     (b, a)

Note: as mentioned in the comments, the order of the products is not preserved, because a set is unordered. If you want to preserve the order, you can use a custom function in place of chaining set and tuple.
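One possible way to write such a custom function is with `dict.fromkeys`, which keeps only the first occurrence of each item and preserves insertion order. This is a minimal sketch, not necessarily what the note had in mind. The name `dedupe_keep_order` is hypothetical, and a plain list of tuples stands in for the column so the example runs without pandas.

```python
def dedupe_keep_order(items):
    """Return a tuple of the unique items, in order of first appearance."""
    # dict keys are unique and keep insertion order (guaranteed since Python 3.7)
    return tuple(dict.fromkeys(items))


# Stand-in for the "product" column: one tuple per party
products = [
    ("a", "a", "a", "a", "b", "c"),
    ("a", "d", "a", "a"),
    ("a", "a", "b", "b", "b"),
]

deduped = [dedupe_keep_order(row) for row in products]
print(deduped)
# [('a', 'b', 'c'), ('a', 'd'), ('a', 'b')]
```

With the DataFrame from In [11], the same function could be passed to apply: df["product"] = df["product"].apply(dedupe_keep_order).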

寒冷纷飞旳雪 2025-01-22 06:55:10

To remove duplicates from a tuple, you can convert it to a set, which automatically discards repeated values, and then convert it back to a tuple. It takes a single call:

In [1]: a=(1,2,2,1,1,1,1,3)

In [2]: tuple(set(a))
Out[2]: (1, 2, 3)
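The call has to be made once per row, not once on the whole column. The sketch below illustrates the difference, with a plain list of tuples standing in for the question's Product column (the variable names are hypothetical). Calling set on the whole collection compares entire tuples, so identical rows collapse and the result has fewer entries than there are rows, which is the likely cause of the length mismatch in the question's ValueError. The sorted call is only there to make the printed order deterministic.

```python
# Stand-in for the Product column: one tuple per row
rows = [
    ("a", "a", "b"),
    ("a", "d", "a"),
    ("a", "a", "b"),  # same tuple as the first row
]

# Whole-column call: set() compares entire tuples, so identical rows collapse
whole_column = tuple(set(rows))
print(len(rows), len(whole_column))
# 3 2

# Per-row call: one deduplicated tuple per row, so the lengths still match
per_row = [tuple(sorted(set(row))) for row in rows]
print(per_row)
# [('a', 'b'), ('a', 'd'), ('a', 'b')]
```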