Creating a "People who viewed this also viewed" list

Posted on 2024-11-08 06:33:58

I'm thinking of creating a "People who viewed this also viewed" list like the ones you see on Amazon, Yelp and other sites. My current plan is a new table with the columns 'product_id', 'last_viewed_product_id' and 'hits': when a user goes from the page for product_id=100 to the page for product_id=101, it creates or updates the row with product_id=101, last_viewed_product_id=100 and increments the 'hits' value. Are there better methods that are more optimized and less computationally intensive?
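For reference, the table described in the question can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module with an in-memory database; the table name `also_viewed` and the helper names `record_transition` and `top_also_viewed` are assumptions made for the example, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE also_viewed (
        product_id             INTEGER NOT NULL,
        last_viewed_product_id INTEGER NOT NULL,
        hits                   INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (product_id, last_viewed_product_id)
    )
    """
)

def record_transition(conn, last_viewed_product_id, product_id):
    """Count one move from last_viewed_product_id to product_id."""
    with conn:  # commit both statements together
        # Create the row on first sight; do nothing if it already exists.
        conn.execute(
            "INSERT OR IGNORE INTO also_viewed "
            "(product_id, last_viewed_product_id, hits) VALUES (?, ?, 0)",
            (product_id, last_viewed_product_id),
        )
        conn.execute(
            "UPDATE also_viewed SET hits = hits + 1 "
            "WHERE product_id = ? AND last_viewed_product_id = ?",
            (product_id, last_viewed_product_id),
        )

def top_also_viewed(conn, last_viewed_product_id, limit=5):
    """Products most often opened right after the given product."""
    rows = conn.execute(
        "SELECT product_id, hits FROM also_viewed "
        "WHERE last_viewed_product_id = ? "
        "ORDER BY hits DESC, product_id LIMIT ?",
        (last_viewed_product_id, limit),
    )
    return rows.fetchall()

record_transition(conn, 100, 101)
record_transition(conn, 100, 101)
record_transition(conn, 100, 102)
top = top_also_viewed(conn, 100)
```

The composite primary key keeps one row per (product, previous product) pair, so each page view costs two small indexed statements and the read side is a single indexed lookup.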

Comments (4)

对不⑦ 2024-11-15 06:33:59

If you have user IDs for all of your visitors (you can create temporary ones for unregistered users), you can create a history table with the columns user_id and product_id that stores every product each user has visited. Then, when a user opens a product, run a query that finds the user_ids that have viewed that product recently and join it to the other products those users have opened. Finally, sort those products by how often those users opened them.

Make sure to cache the result, since running this join on every page view would slow down any SQL server.
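The history-table approach above could look roughly like this. It is a sketch using Python's sqlite3 module; the table name `view_history`, the `viewed_at` column, the 30-day window and the function name `also_viewed_by_users` are all assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE view_history (
        user_id    INTEGER NOT NULL,
        product_id INTEGER NOT NULL,
        viewed_at  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE INDEX idx_history_product ON view_history (product_id, user_id);
    CREATE INDEX idx_history_user    ON view_history (user_id, product_id);
    """
)

# Users 1-3 viewed product 100; user 4 never did.
views = [(1, 100), (1, 101), (2, 100), (2, 101), (2, 102),
         (3, 100), (3, 101), (4, 103)]
conn.executemany(
    "INSERT INTO view_history (user_id, product_id) VALUES (?, ?)", views
)
conn.commit()

def also_viewed_by_users(conn, product_id, window="-30 days", limit=5):
    """Other products opened by users who recently viewed product_id."""
    rows = conn.execute(
        """
        SELECT other.product_id, COUNT(DISTINCT other.user_id) AS viewers
        FROM view_history AS seed
        JOIN view_history AS other
          ON other.user_id = seed.user_id
         AND other.product_id <> seed.product_id
        WHERE seed.product_id = ?
          AND seed.viewed_at >= datetime('now', ?)
        GROUP BY other.product_id
        ORDER BY viewers DESC, other.product_id
        LIMIT ?
        """,
        (product_id, window, limit),
    )
    return rows.fetchall()

related = also_viewed_by_users(conn, 100)
```

Counting distinct users rather than raw rows keeps one very active visitor from dominating the list; whether that is the right choice depends on the site.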

半世晨晓 2024-11-15 06:33:59

I'm pretty sure that Amazon uses Association Rules for this.

The seminal paper:

http://dl.acm.org/citation.cfm?id=170072

The fast algorithm (FP-Growth):

http://link.springer.com/chapter/10.1007/3-540-47887-6_34#page-1

I haven't seen a PHP library, but there are libraries for Java and Python.
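To show what an association rule measures, here is a small pure-Python sketch that computes support and confidence for pairs of products. It is a brute-force pair count, not FP-Growth, and it is not taken from either paper; the function name `pair_rules`, the thresholds and the sample sessions are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

def pair_rules(sessions, min_support=0.2, min_confidence=0.5):
    """Return rules (lhs, rhs, support, confidence): 'viewed lhs -> also viewed rhs'."""
    n = len(sessions)
    item_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        items = sorted(set(session))          # each product counts once per session
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                   # share of sessions containing both
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]   # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; ties broken by product ids for a stable order.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))

sessions = [
    {100, 101},
    {100, 101, 102},
    {100, 102},
    {101, 103},
    {100, 101},
]
rules = pair_rules(sessions)
```

With this sample data, the rule 100 -> 101 has support 3/5 and confidence 3/4. For a real catalogue, a library implementation of FP-Growth would avoid enumerating every pair.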

秋意浓 2024-11-15 06:33:58

As far as I'm aware, the "tricks" Amazon uses to make things less computationally intensive are to a) use Bayesian stats/averages and b) compute partial aggregates. The latter means you don't need to count everything (you can instead sum pre-computed aggregates). The former lets you inject what you infer will be related material.
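One possible reading of those two ideas, sketched in Python. This is an interpretation, not Amazon's actual method: the per-day counters, the function names `merge_daily_counts` and `smoothed_rate`, and the `prior_rate` and `prior_weight` parameters are all assumptions for the example.

```python
from collections import Counter

def merge_daily_counts(daily_counts):
    """Sum pre-computed per-day co-view counters instead of rescanning raw logs."""
    total = Counter()
    for day in daily_counts:
        total.update(day)
    return total

def smoothed_rate(co_views, source_views, prior_rate, prior_weight=20):
    """Bayesian-average style estimate of P(view target | viewed source).

    Behaves as if prior_weight extra views had already been observed at
    prior_rate, so pairs with little data are pulled toward the prior.
    The prior is where an inferred relation (same category, say) can be injected.
    """
    return (co_views + prior_weight * prior_rate) / (source_views + prior_weight)

# Partial aggregates: one small counter per day, keyed by (source, target).
daily = [
    Counter({(100, 101): 3, (100, 102): 1}),
    Counter({(100, 101): 2}),
]
totals = merge_daily_counts(daily)

# Little data: the estimate stays close to the prior of 0.1.
sparse = smoothed_rate(co_views=1, source_views=1, prior_rate=0.1)
# Plenty of data: the observed rate of 0.5 dominates.
dense = smoothed_rate(co_views=500, source_views=1000, prior_rate=0.1)
```

Adding a new day then only means summing one more small counter, and a pair seen once is not ranked as if it had a 100% co-view rate.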

刘备忘录 2024-11-15 06:33:58

It seems you're on the right path. A few suggestions:

On the computational cost: you probably want to cache your results, so you only serve a top 'x' list that is refreshed once a day or so. Real-time results do not seem important in this case.
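A cache with a daily refresh could be as simple as the sketch below. The class name `TopListCache`, the injectable `clock` parameter and the stand-in function `slow_top_list` are assumptions for illustration; a real site might use an external cache or a nightly batch job instead.

```python
import time

class TopListCache:
    """Keeps each product's top list for ttl_seconds before recomputing it."""

    def __init__(self, compute, ttl_seconds=24 * 60 * 60, clock=time.time):
        self._compute = compute   # function: product_id -> list of product ids
        self._ttl = ttl_seconds
        self._clock = clock       # injectable so the expiry logic can be tested
        self._entries = {}        # product_id -> (computed_at, top_list)

    def get(self, product_id):
        now = self._clock()
        entry = self._entries.get(product_id)
        if entry is None or now - entry[0] >= self._ttl:
            entry = (now, self._compute(product_id))
            self._entries[product_id] = entry
        return entry[1]

calls = []

def slow_top_list(product_id):
    """Stand-in for the expensive query; records each call."""
    calls.append(product_id)
    return [101, 102]

fake_now = [0.0]
cache = TopListCache(slow_top_list, ttl_seconds=86400, clock=lambda: fake_now[0])
first = cache.get(100)    # computed
second = cache.get(100)   # served from the cache
fake_now[0] = 86400.0
third = cache.get(100)    # entry expired, computed again
```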

I'm not sure what sort of products you have on your site, but if the variety is large, you might want to show only items that share related information (so a Star Wars product would only have Star Wars related items popping up).

So if your products have "tags" or keywords, you may want to use that relationship.
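A tag filter of that kind might look like this. The function name `filter_by_shared_tags` and the `tags_by_product` mapping are hypothetical; how tags are stored on a real site will differ.

```python
def filter_by_shared_tags(source_id, candidates, tags_by_product):
    """Keep only candidates that share at least one tag with the source product."""
    source_tags = tags_by_product.get(source_id, set())
    return [
        pid for pid in candidates
        if source_tags & tags_by_product.get(pid, set())
    ]

tags_by_product = {
    100: {"star-wars", "dvd"},
    101: {"star-wars", "toy"},
    102: {"cooking", "book"},
    103: {"dvd"},
}
# Candidates as ranked by view counts, before filtering.
ranked = [102, 101, 103]
related = filter_by_shared_tags(100, ranked, tags_by_product)
```

The filter preserves the incoming ranking and only drops unrelated items, so it can be applied after any of the counting approaches.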

You may also want to weight each view by how the user got to the product. If they got there by clicking on the list you provided, those items will keep reinforcing themselves and not give other products a chance to show up, so give such views a low weight. Items reached in other ways carry more weight and would show up instead.
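The weighting idea can be sketched by replacing the plain hit counter with a weighted score. The source labels and the weight values in `SOURCE_WEIGHTS` are made up for the example and would need tuning on real traffic.

```python
from collections import defaultdict

# Hypothetical weights: clicks on our own recommendation list count for less.
SOURCE_WEIGHTS = {
    "recommendation_list": 0.25,
    "search": 1.0,
    "browse": 1.0,
}

def weighted_scores(transitions, default_weight=1.0):
    """Sum a weight per (from_product, to_product) move based on its source."""
    scores = defaultdict(float)
    for from_id, to_id, source in transitions:
        scores[(from_id, to_id)] += SOURCE_WEIGHTS.get(source, default_weight)
    return dict(scores)

transitions = [
    (100, 101, "recommendation_list"),
    (100, 101, "recommendation_list"),
    (100, 101, "recommendation_list"),
    (100, 102, "search"),
]
scores = weighted_scores(transitions)
```

Here product 101 has three raw hits against one for product 102, but because all three came from the recommendation list, 102 ends up with the higher score.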
