Algorithm for recommending products
What's a good algorithm for suggesting things that someone might like based on their previous choices? (e.g. as popularised by Amazon to suggest books, and used in services like iRate Radio or YAPE where you get suggestions by rating items)
Simple and straightforward (order cart):
Keep a list of transactions that records which items were ordered together. For instance, when someone buys a camcorder on Amazon, they often buy recording media at the same time.
When deciding what is "suggested" on a given product page, look at all the orders where that product was ordered, count all the other items purchased at the same time, and then display the top 5 items that were most frequently purchased at the same time.
You can expand it from there based not only on orders, but what people searched for in sequence on the website, etc.
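The counting step described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming orders are available as plain lists of item names; the function name `also_bought` and the sample data are made up for the example.

```python
from collections import Counter

def also_bought(orders, product, top_n=5):
    """Return up to top_n items most often ordered together with `product`."""
    counts = Counter()
    for order in orders:
        items = set(order)  # ignore duplicate line items within one order
        if product in items:
            counts.update(items - {product})
    # Sort by count (descending), then by name so ties come out in a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical order history.
orders = [
    ["camcorder", "tape", "bag"],
    ["camcorder", "tape"],
    ["camcorder", "tripod"],
    ["tape", "bag"],
]
print(also_bought(orders, "camcorder"))  # prints ['tape', 'bag', 'tripod']
```

On a real site the counts would more likely be precomputed or kept in a database than recalculated on every page view.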
In terms of a rating system (i.e. movie ratings):
It becomes more difficult when you throw in ratings. Rather than a discrete basket of items one has purchased, you have a customer history of item ratings.
At that point you're looking at data mining, and the complexity is tremendous.
A simple algorithm, though, isn't far from the one above; it just takes a different form. Take the customer's highest-rated and lowest-rated items, and find other customers with similar highest-rated and lowest-rated lists. You want to match them with others who share similar extreme likes and dislikes: if you focus on likes only, you will sooner or later suggest something they hate, and that is a bad experience. In suggestion systems you always want to err on the side of a "lukewarm" experience rather than a "hate" one, because one bad experience will sour them on using the suggestions.
Then suggest items from those other customers' highest-rated lists to the customer.
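Here is a rough sketch of that matching idea in Python. Everything specific in it is an assumption for illustration: ratings are on a 1 to 5 scale, "liked" means 4 or more, "disliked" means 2 or less, and the function names are invented. Agreement on extremes adds to a customer's weight and opposite opinions subtract from it, so people with clashing tastes are not used as a source of suggestions.

```python
def likes_and_dislikes(ratings, like_at=4, dislike_at=2):
    """Split {item: score} into the sets of strongly liked and disliked items."""
    liked = {item for item, score in ratings.items() if score >= like_at}
    disliked = {item for item, score in ratings.items() if score <= dislike_at}
    return liked, disliked

def taste_overlap(a, b):
    """Shared likes and shared dislikes count for the match, clashes against it."""
    a_like, a_dis = likes_and_dislikes(a)
    b_like, b_dis = likes_and_dislikes(b)
    agree = len(a_like & b_like) + len(a_dis & b_dis)
    clash = len(a_like & b_dis) + len(a_dis & b_like)
    return agree - clash

def suggest(target, others, top_n=5):
    """Suggest items liked by similar customers that `target` has not rated."""
    scores = {}
    for ratings in others.values():
        weight = taste_overlap(target, ratings)
        if weight <= 0:
            continue  # no net agreement on extremes, so skip this customer
        liked, _ = likes_and_dislikes(ratings)
        for item in liked:
            if item not in target:
                scores[item] = scores.get(item, 0) + weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical rating histories.
target = {"Alien": 5, "Heat": 4, "Cats": 1}
others = {
    "ann": {"Alien": 5, "Cats": 1, "Blade Runner": 5, "Titanic": 2},
    "bob": {"Alien": 1, "Cats": 5, "Grease": 5},
    "cy": {"Heat": 5, "Ronin": 4, "Blade Runner": 4},
}
print(suggest(target, others))  # prints ['Blade Runner', 'Ronin']
```

Note that "Grease" is never suggested: bob likes what the target hates and vice versa, so his favourites are ignored.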
Consider looking at "What is a Good Recommendation Algorithm?" and its discussion on Hacker News.
There isn't a definitive answer and it's highly unlikely there is a standard algorithm for that.
How you do that heavily depends on the kind of data you want to relate and how it is organized. It depends on how you define "related" in the scope of your application.
Often the simplest idea produces good results. In the case of books, if you have a database with several attributes per book entry (say author, date, genre, etc.), you can simply choose to suggest a random set of books from the same author, the same genre, with similar titles, and so on.
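As a minimal sketch of that attribute-based approach, assuming the catalog is a dict of book records with `author` and `genre` fields (the field names, IDs and the `similar_books` function are hypothetical), you could pick random books that share an attribute with the one being viewed:

```python
import random

def similar_books(catalog, book_id, k=3, seed=None):
    """Pick up to k random books sharing the author or genre of `book_id`."""
    book = catalog[book_id]
    candidates = [
        other_id
        for other_id, other in catalog.items()
        if other_id != book_id
        and (other["author"] == book["author"] or other["genre"] == book["genre"])
    ]
    rng = random.Random(seed)  # pass a seed only if you want repeatable picks
    return rng.sample(candidates, min(k, len(candidates)))

# Hypothetical catalog.
catalog = {
    "b1": {"title": "Dune", "author": "Herbert", "genre": "sf"},
    "b2": {"title": "Dune Messiah", "author": "Herbert", "genre": "sf"},
    "b3": {"title": "Neuromancer", "author": "Gibson", "genre": "sf"},
    "b4": {"title": "Emma", "author": "Austen", "genre": "romance"},
}
print(similar_books(catalog, "b1"))  # some ordering of 'b2' and 'b3'
```

Title similarity is left out here; it would need some text comparison on top of this.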
However, you can always try something more complicated. Keep a record of other users who required this "product" and suggest other "products" those users required in the past (a product can be anything from a book to a song to anything you can imagine). This is what most major sites with a suggestion feature do (although they probably take in a lot of information, from product attributes to demographics, to best serve the client).
Or you can even resort to so-called AI: neural networks can be constructed that take in all those attributes of the product and try (based on previous observations) to relate it to others, updating themselves as they go.
A mix of any of those cases might work for you.
I would personally recommend thinking about how you want the algorithm to work and how to suggest related "products". Then, you can explore all the options: from simple to complicated and balance your needs.
Recommendation algorithms are huge business nowadays. Netflix, for one, is offering a $1,000,000 prize (the Netflix Prize) for what is only a modest (10%) improvement in the accuracy of its algorithm.
As you have deduced from the answers so far, and indeed as you suggest, this is a large and complex topic. I can't give you an answer, at least nothing that hasn't already been said, but I can point you to a couple of excellent books on the topic:
Programming Collective Intelligence: http://oreilly.com/catalog/9780596529321/ is a fairly gentle introduction with samples in Python.
Collective Intelligence in Action: http://www.manning.com/alag looks a bit more in depth (but I've only just read the first chapter or two) and has examples in Java.
I think searching Google for least mean squares regression (or something like that) might give you something to chew on.
I think most of the useful advice has already been given, but I thought I'd add how I would go about it. This is just thinking out loud, since I haven't built anything like this.
First I would find the point in the application where I can sample the data to be used; for a store, that is probably the checkout. Then I would save a relation between each pair of items in the checkout cart.
Now, if a user goes to an item's page, I can count the relations to other items and pick, for example, the 5 items with the highest number of relations to the selected item.
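To make the "save a relation at checkout" step concrete, here is a small in-memory sketch. The `RelationStore` class and its method names are invented for the example; in a real shop the counts would presumably live in a database table keyed by item pairs.

```python
from collections import defaultdict
from itertools import combinations

class RelationStore:
    """In-memory stand-in for a table of (item, item) -> times bought together."""

    def __init__(self):
        self.related = defaultdict(lambda: defaultdict(int))

    def record_checkout(self, cart):
        # Save one relation for every unordered pair of distinct items in the cart.
        for a, b in combinations(sorted(set(cart)), 2):
            self.related[a][b] += 1
            self.related[b][a] += 1

    def top_related(self, item, n=5):
        # .get avoids creating an empty entry for items never seen at checkout.
        counts = self.related.get(item, {})
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [other for other, _ in ranked[:n]]

store = RelationStore()
store.record_checkout(["bread", "butter", "jam"])
store.record_checkout(["bread", "butter"])
print(store.top_related("bread"))  # prints ['butter', 'jam']
```

Updating the counts at checkout keeps the item page lookup cheap, at the cost of storing a counter for every pair of items that has ever shared a cart.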
I know it's simple, and there are probably better ways.
But I hope it helps.
Market basket analysis is the field of study you're looking for:
Microsoft offers two suitable algorithms with Analysis Services:
Microsoft Association Algorithm
Microsoft Decision Trees Algorithm
Check out this msdn article for suggestions on how best to use Analysis Services to solve this problem.
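For a feel of what market basket analysis measures, here is a plain Python sketch of the three standard association rule metrics (support, confidence and lift) for a single rule "A implies B". It is not the Microsoft algorithm and does not use Analysis Services; the `rule_stats` function and the baskets are made up for illustration.

```python
def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(baskets)
    if n == 0:
        return 0.0, 0.0, 0.0
    sets = [set(basket) for basket in baskets]
    has_a = sum(1 for s in sets if antecedent in s)
    has_c = sum(1 for s in sets if consequent in s)
    has_both = sum(1 for s in sets if antecedent in s and consequent in s)
    support = has_both / n                              # share of baskets with both
    confidence = has_both / has_a if has_a else 0.0     # P(consequent | antecedent)
    lift = confidence / (has_c / n) if has_c else 0.0   # confidence vs. base rate
    return support, confidence, lift

# Hypothetical baskets.
baskets = [
    ["camcorder", "tape"],
    ["camcorder", "tape", "bag"],
    ["camcorder", "tripod"],
    ["tape", "bag"],
]
support, confidence, lift = rule_stats(baskets, "camcorder", "tape")
print(support, confidence, lift)
```

With these four baskets the rule "camcorder implies tape" has support 0.5, confidence of about 0.67 and lift of about 0.89. A lift below 1 means tape is bought slightly less often with a camcorder than overall in this tiny sample, which is a reminder that raw co-purchase counts can mislead for items that are popular anyway.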
There is a recommendation platform created by Amazon called Certona; you may find it useful. It is used by companies such as B&Q and Screwfix. Find more information at www.certona.com/