How to get product pairs for each customer using pandas

Posted on 2025-02-11 19:13:30

I want to get pairs of products that a customer viewed consecutively, where the second product is one of the recommendations shown for the first product.

For example, customer 1 viewed product P1 at timestamp t1, and the recommendations for that product are R1-R2-R3-R4 (4 different products).

Customer 1 then viewed product P2 at timestamp t2, and the recommendations for P2 are R11-R12-R13-R14.

We need (C1, P1, P2) where P2 is one of the recommendations of P1, i.e. one of R1/R2/R3/R4.

customer_id  product_id(viewed)  recommendation_items      time_stamp
1            123                 111-098-066-555           10-05-2020
1            111                 213-012-122               10-05-2020
2            213                 321-98712-1212-3434-4545  10-05-2020
2            987                 98711-4567                10-05-2020

As the table above shows, customer 1 viewed product 123, whose recommendations include item 111, and he then also viewed product 111 (the product ID in the second row). For customer 2 this did not happen. The result table should look like this:

customer_id  product_id_1(viewed)  product_id_2(viewed)
1            123                   111

Can anybody provide the logic or Python code? The suggested code below reports a match even when only a portion of a value matches, but we need to match the entire value.

Comments (1)

梦亿 2025-02-18 19:13:30

IIUC, you can use a regex to extract the matching recommendations, then join:

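# Group the rows by customer.
g = df.groupby('customer_id')
# Per customer, build one alternation pattern from all viewed product IDs, e.g. '123|111'.
regexes = g['product_id(viewed)'].agg(lambda x: '|'.join(x.astype(str)))
# Extract every viewed ID that occurs as a whole token (\b) in each recommendation string,
# then drop the extra index levels so the result aligns with df's original index.
matches = (g['recommendation_items']
           .apply(lambda s: s.str.extractall(fr'\b({regexes[s.name]})\b'))
           .droplevel(['customer_id', 'match'])[0]
          )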
# 0    111
# 3    987
# Name: 0, dtype: object
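
As a small illustration of why the `\b` anchors matter for the "match the entire value" requirement, the sketch below uses only the standard `re` module. The pattern and the two sample strings are made up for the demonstration; they are not taken from the asker's real data.

```python
import re

# Pattern of viewed product IDs for one customer, built the same way as `regexes` above.
pattern = re.compile(r"\b(213|987)\b")

# "-" is a non-word character, so a word boundary sits on both sides of every ID.
whole = pattern.findall("321-987-1212")      # 987 appears as a whole ID
partial = pattern.findall("321-98712-1212")  # 987 is only a prefix of 98712

print(whole)    # ['987']
print(partial)  # []
```

If product IDs could ever contain regex metacharacters, it would be safer to pass each one through `re.escape` before joining them with `|`.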

# Right-join on the index so that only rows with a match are kept.
out = (df.join(matches.rename('product_id2(viewed)'), how='right')
         [['customer_id', 'product_id(viewed)', 'product_id2(viewed)']]
      )

output:

   customer_id  product_id(viewed) product_id2(viewed)
0            1                 123                 111
3            2                 987                 987

To remove self-recommendations:

# Same join as before, then keep only rows where the matched ID differs from the viewed product.
out = (df.join(matches.rename('product_id2(viewed)'), how='right')
         [['customer_id', 'product_id(viewed)', 'product_id2(viewed)']]
         .loc[lambda d: d['product_id(viewed)'].astype(str).ne(d['product_id2(viewed)'])]
      )

output:

   customer_id  product_id(viewed) product_id2(viewed)
0            1                 123                 111
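
The question also asks for consecutive views, which the regex approach does not check. One possible alternative, shown here only as a sketch, is to compare each row with the next product viewed by the same customer and to split the recommendations on `-` so that only whole IDs can match. The sample frame is an assumed reconstruction of the question's table (the column splits are a guess), product IDs are stored as strings, and rows are assumed to already be in viewing order because every timestamp in the sample is identical.

```python
import pandas as pd

# Sample data reconstructed from the question's table (column splits are assumed).
df = pd.DataFrame({
    "customer_id": [1, 1, 2, 2],
    "product_id(viewed)": ["123", "111", "213", "987"],
    "recommendation_items": [
        "111-098-066-555",
        "213-012-122",
        "321-98712-1212-3434-4545",
        "98711-4567",
    ],
    "time_stamp": ["10-05-2020"] * 4,
})

# Next product viewed by the same customer (rows assumed to be in viewing order).
next_viewed = df.groupby("customer_id")["product_id(viewed)"].shift(-1)

# Split recommendations into whole IDs so that "987" never matches "98712".
rec_lists = df["recommendation_items"].str.split("-")

# Keep rows whose next viewed product is exactly one of the recommended IDs.
is_pair = [nxt in recs for nxt, recs in zip(next_viewed, rec_lists)]

pairs = (
    df.loc[is_pair, ["customer_id", "product_id(viewed)"]]
      .rename(columns={"product_id(viewed)": "product_id_1(viewed)"})
      .assign(**{"product_id_2(viewed)": next_viewed[is_pair]})
)
print(pairs)
```

On this sample only customer 1's pair (123, 111) remains. With real timestamps, the frame would first need to be sorted by customer and time so that `shift(-1)` really returns the next view.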