Beyond item-to-item recommendations
Simple item-to-item recommendation systems are well known and frequently implemented; the Slope One algorithm is one example. That works fine while a user hasn't rated many items, but once they have, I want to offer more fine-grained recommendations. Let's take a music recommendation system as an example, since they are quite popular. If a user is viewing a piece by Mozart, a suggestion for another piece by Mozart or Beethoven might be given. But if the user has rated many classical pieces, we might be able to find a correlation between the items and see that the user dislikes vocals or certain instruments. I'm assuming this would be a two-part process: the first part is to find correlations within each user's ratings, and the second is to build the recommendation matrix from this extra data. So the question is, are there any open-source implementations or papers that can be used for each of these steps?
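For reference, the item-to-item baseline mentioned above can be sketched in a few lines. This is a minimal weighted Slope One in plain Python, not taken from any particular library; the users, pieces, and ratings are made-up illustration data.

```python
from collections import defaultdict

# Hypothetical ratings: {user: {item: rating}}
ratings = {
    "john": {"mozart_40": 5, "beethoven_5": 3, "bach_cello": 2},
    "mark": {"mozart_40": 3, "beethoven_5": 4},
    "lucy": {"beethoven_5": 2, "bach_cello": 5},
}

def build_deviations(ratings):
    """Average rating difference dev[j][i] = mean(r_j - r_i) over users who rated both."""
    diff_sum = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for j, r_j in user_ratings.items():
            for i, r_i in user_ratings.items():
                if i != j:
                    diff_sum[j][i] += r_j - r_i
                    count[j][i] += 1
    dev = {j: {i: diff_sum[j][i] / count[j][i] for i in diff_sum[j]}
           for j in diff_sum}
    return dev, count

def predict(user_ratings, target, dev, count):
    """Weighted Slope One: each rated item i votes (dev[target][i] + r_i),
    weighted by how many users rated both i and target."""
    num, den = 0.0, 0
    for i, r_i in user_ratings.items():
        c = count.get(target, {}).get(i, 0)
        if c:
            num += (dev[target][i] + r_i) * c
            den += c
    return num / den if den else None

dev, count = build_deviations(ratings)
lucy_mozart = predict(ratings["lucy"], "mozart_40", dev, count)
```

Note that nothing here looks at what the pieces actually are; the prediction comes purely from rating differences, which is why it cannot express something like "dislikes vocals".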
Comments (3)
Taste may have something useful. It's moved to the Mahout project:
http://taste.sourceforge.net/
In general, the idea is that given a user's past preferences, you want to predict what they'll select next and recommend it. You build a machine-learning model in which the inputs are what a user has picked in the past and the attributes of each pick. The output is the item(s) they'll pick. You create training data by holding back some of their choices, and using their history to predict the data you held back.
There are lots of different machine-learning models you can use. Decision trees are common.
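The hold-back idea can be sketched roughly like this. To keep it dependency-free, the model below is a simple co-occurrence counter standing in for a real learner such as a decision tree, and the listening histories are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical pick histories, oldest first
histories = {
    "u1": ["mozart_40", "beethoven_5", "bach_cello"],
    "u2": ["mozart_40", "bach_cello", "beethoven_5"],
    "u3": ["beethoven_5", "mozart_40", "bach_cello"],
    "u4": ["verdi_aida", "puccini_tosca", "mozart_40"],
}

def split_last(histories):
    """Hold back each user's most recent pick as the thing to predict."""
    train, held = {}, {}
    for user, picks in histories.items():
        if len(picks) >= 2:
            train[user] = picks[:-1]
            held[user] = picks[-1]
    return train, held

def fit_cooccurrence(train):
    """Stand-in model: count how often two items share a history."""
    co = defaultdict(Counter)
    for picks in train.values():
        for a in picks:
            for b in picks:
                if a != b:
                    co[a][b] += 1
    return co

def predict_next(history, co):
    """Recommend the unseen item that co-occurs most with the history."""
    scores = Counter()
    for a in history:
        for b, n in co.get(a, {}).items():
            if b not in history:
                scores[b] += n
    return scores.most_common(1)[0][0] if scores else None

def hit_rate(train, held, co):
    """Fraction of users whose held-back pick was predicted exactly."""
    hits = sum(predict_next(train[u], co) == held[u] for u in held)
    return hits / len(held)

train, held = split_last(histories)
co = fit_cooccurrence(train)
score = hit_rate(train, held, co)
```

Swapping in a different model only means replacing `fit_cooccurrence` and `predict_next`; the split and the scoring stay the same, so models can be compared on equal terms.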
One answer is that any recommender system ought to have some of the properties you describe. Initially, recommendations aren't so good and are all over the place. As it learns tastes, the recommendations will come from the area the user likes.
But the collaborative filtering process you describe is fundamentally not trying to solve the problem you are trying to solve. It is based on user ratings, and two songs aren't rated similarly because they are similar songs -- they're rated similarly just because similar people like them.
What you really need is to define your notion of song-song similarity. Is it based on how the song sounds? The composer? Because it sounds like your notion is not actually based on ratings. That is 80% of the problem you are trying to solve.
I think the question you are really trying to answer is: what items are most similar to a given item? Given your item similarity, that's an easier problem than recommendation.
Mahout can help with all of these things, except song-song similarity based on its audio -- or at least provide a start and framework for your solution.
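Once a similarity notion is chosen, "most similar to a given item" is a short computation. Here is a minimal sketch using cosine similarity over hand-assigned attributes; the attribute set (has vocals, orchestral, piano) and the values are assumptions made up for the example, and this is plain Python rather than Mahout's API.

```python
import math

# Hypothetical attribute vectors: [has_vocals, orchestral, piano]
features = {
    "mozart_requiem":             [1, 1, 0],
    "beethoven_5":                [0, 1, 0],
    "beethoven_piano_concerto_5": [0, 1, 1],
    "chopin_nocturne":            [0, 0, 1],
}

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar(target, features, k=2):
    """Return the k items whose attribute vectors are closest to the target's."""
    scored = [(cosine(features[target], vec), name)
              for name, vec in features.items() if name != target]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

neighbours = most_similar("chopin_nocturne", features, k=1)
```

The hard part stays where the answer puts it: deciding which attributes go into the vectors. The ranking step itself is trivial once they exist.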
There are two techniques that I can think of:
Common characteristics of these methods are:
[...] each user. This pretty much rules out efficient database queries when searching for recommendations.
[...] when the user votes for an item.
[...] the input data (e.g. has vocals, beats per minute, musical scales, whatever) are very critical to the quality of the classification.
Please note that these suggestions come from university courses in knowledge-based systems and artificial neural nets, not from practical experience.
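The names of the two techniques did not survive in the text above, so the following is only a guess at the general shape: one small model per user, fed item attributes, and nudged each time that user votes. It uses a single perceptron (the simplest neural unit) as an assumed stand-in; the binary features and the votes are invented.

```python
class UserPerceptron:
    """One tiny like/dislike classifier per user (assumed stand-in model)."""

    def __init__(self, n_features, lr=1.0):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def score(self, x):
        return sum(w * xi for w, xi in zip(self.w, x)) + self.b

    def predict(self, x):
        """True if the model expects the user to like an item with features x."""
        return self.score(x) > 0

    def update(self, x, liked):
        """Perceptron rule: adjust weights only when the current guess is wrong."""
        if self.predict(x) != liked:
            target = 1.0 if liked else -1.0
            self.w = [w + self.lr * target * xi for w, xi in zip(self.w, x)]
            self.b += self.lr * target

# Hypothetical binary features: [has_vocals, fast_tempo, minor_key]
votes = [
    ([1, 0, 0], False),
    ([0, 1, 0], True),
    ([1, 1, 0], False),
    ([0, 0, 1], True),
    ([1, 0, 1], False),
    ([0, 1, 1], True),
]

models = {"alice": UserPerceptron(n_features=3)}  # one model per user
for x, liked in votes:
    models["alice"].update(x, liked)  # updated whenever the user votes

likes_vocal_piece = models["alice"].predict([1, 0, 0])
likes_fast_instrumental = models["alice"].predict([0, 1, 0])
```

Because each user has their own weights, finding recommendations means scoring candidate items one by one in application code rather than with a single database query, and the result is only as good as the chosen input features.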