How do I adapt my recommendation engine to cold start?

Posted on 2024-08-16 19:08:25

I am curious about the methods and approaches for overcoming the "cold start" problem: when a new user or item enters the system, the lack of information about that new entity makes it hard to produce recommendations.

I can think of doing some prediction-based recommendation (using attributes such as gender, nationality and so on).

Comments (6)

诺曦 2024-08-23 19:08:25

You can cold start a recommendation system.

There are two types of recommendation systems: collaborative filtering and content-based. Content-based systems use metadata about the things you are recommending; the question then is which metadata is important. Collaborative filtering doesn't care about the metadata; it uses only what people did with or said about an item to make a recommendation. With collaborative filtering you don't have to worry about which terms in the metadata are important. In fact, you don't need any metadata at all. The problem with collaborative filtering is that you need data. Until you have enough of it, you can use content-based recommendations. You can also blend the two methods: start out 100% content-based, then mix in collaborative filtering as you collect more data.
That is the method I have used in the past.
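
A minimal sketch of that kind of blending, not the exact method used above. The names `cf_weight`, `blend_scores`, `top_n` and the `ramp` parameter are made up for illustration, and the linear ramp is only one assumed way to shift weight from content-based scores to collaborative filtering scores as interaction data accumulates (counted per user or for the whole system, whichever fits).

```python
def cf_weight(n_interactions, ramp=50):
    """Share of the final score given to collaborative filtering.

    0.0 with no interaction data (pure content-based), rising linearly
    to 1.0 once `ramp` interactions exist. `ramp` is an assumed tuning knob.
    """
    if ramp <= 0:
        raise ValueError("ramp must be positive")
    return min(max(n_interactions, 0) / ramp, 1.0)


def blend_scores(content_scores, cf_scores, n_interactions, ramp=50):
    """Blend per-item scores (dicts of item -> score) from both recommenders.

    An item missing from one recommender counts as 0.0 for that recommender.
    """
    w = cf_weight(n_interactions, ramp)
    items = set(content_scores) | set(cf_scores)
    return {
        item: (1.0 - w) * content_scores.get(item, 0.0)
        + w * cf_scores.get(item, 0.0)
        for item in items
    }


def top_n(scores, n=3):
    """Item ids by descending blended score; ties are broken by item id."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


# Hypothetical scores from the two recommenders.
content = {"a": 0.9, "b": 0.2}
collab = {"b": 1.0, "c": 0.8}

for n in (0, 25, 50):
    print(n, top_n(blend_scores(content, collab, n)))
```

With no interactions the ranking follows the content-based scores alone; by the time `ramp` interactions exist it follows collaborative filtering alone.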

Another common technique is to treat the content-based portion as a simple search problem. You put the metadata in as the text or body of a document, then index the documents. You can do this with Lucene and Solr without writing any code.
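
To make the "metadata as a document" idea concrete, here is a rough pure-Python stand-in. It is not Lucene or Solr and does not reproduce their scoring; it just indexes each item's metadata text with a simple TF-IDF weighting and ranks other items by cosine similarity. The catalog, item ids and function names are all invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a real search engine does far more."""
    return text.lower().split()


def build_index(docs):
    """docs: {item_id: metadata text} -> {item_id: {term: tf-idf weight}}."""
    tokens = {item: tokenize(text) for item, text in docs.items()}
    # Document frequency: in how many items does each term appear?
    df = Counter(term for toks in tokens.values() for term in set(toks))
    n_docs = len(docs)
    index = {}
    for item, toks in tokens.items():
        tf = Counter(toks)
        index[item] = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        }
    return index


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def similar_items(index, item_id, n=3):
    """Rank the other items by similarity of their metadata to `item_id`."""
    query = index[item_id]
    scored = [
        (other, cosine(query, vec))
        for other, vec in index.items()
        if other != item_id
    ]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:n]


# Hypothetical catalog: the metadata string is the "document body".
catalog = {
    "hiking-boots": "outdoor hiking leather boots waterproof",
    "trail-shoes": "outdoor trail running shoes waterproof",
    "espresso-maker": "kitchen coffee espresso machine",
}
index = build_index(catalog)
print(similar_items(index, "hiking-boots", n=2))
```

Because the ranking depends only on metadata, a brand-new item can be matched as soon as it is indexed, with no interaction history at all.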

If you want to know how basic collaborative filtering works, check out Chapter 2 of "Programming Collective Intelligence" by Toby Segaran.
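
For a quick feel of the idea before picking up the book, here is a small sketch of user-based collaborative filtering. It is not the book's code; the similarity measure (an inverse Euclidean distance over co-rated items), the function names and the toy ratings are all assumptions chosen for brevity.

```python
import math


def similarity(ratings, a, b):
    """1 / (1 + Euclidean distance) over items both users rated; 0.0 if none."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dist = math.sqrt(sum((ratings[a][i] - ratings[b][i]) ** 2 for i in shared))
    return 1.0 / (1.0 + dist)


def recommend(ratings, user):
    """Predict ratings for items `user` has not rated, best first.

    Each prediction is the similarity-weighted average of other users' ratings.
    """
    totals, weights = {}, {}
    for other in ratings:
        if other == user:
            continue
        sim = similarity(ratings, user, other)
        if sim <= 0.0:
            continue  # nothing in common, so this user tells us nothing
        for item, rating in ratings[other].items():
            if item not in ratings[user]:
                totals[item] = totals.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predictions = {item: totals[item] / weights[item] for item in totals}
    return sorted(predictions.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: user -> {item: rating}. "new" has rated nothing yet.
ratings = {
    "ann": {"A": 5, "B": 1},
    "bob": {"A": 5, "B": 1, "C": 4},
    "cat": {"A": 1, "B": 5, "D": 2},
    "new": {},
}
print(recommend(ratings, "ann"))
print(recommend(ratings, "new"))
```

The second call returns an empty list: with no ratings there is nobody to be similar to, which is the cold-start problem in miniature.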

情绪少女 2024-08-23 19:08:25

Maybe there are times you just shouldn't make a recommendation? "Insufficient data" should qualify as one of those times.

I just don't see how prediction recommendations based on "gender, nationality and so on" will amount to more than stereotyping.

IIRC, places such as Amazon built up their databases for a while before rolling out recommendations. It's not the kind of thing you want to get wrong; there are lots of stories out there about inappropriate recommendations based on insufficient data.

渡你暖光 2024-08-23 19:08:25

I'm working on this problem myself, and this paper from Microsoft on Boltzmann machines looks worthwhile: http://research.microsoft.com/pubs/81783/gunawardana09__unified_approac_build_hybrid_recom_system.pdf

雄赳赳气昂昂 2024-08-23 19:08:25

This has been asked several times before (naturally, I cannot find those questions now :/), but the general conclusion was that it's better to avoid such recommendations. In various parts of the world the same names belong to different sexes, and so on ...

疯狂的代价 2024-08-23 19:08:25

Recommendations based on "similar users liked..." clearly must wait. You can give out coupons or other incentives to survey respondents if you are absolutely committed to doing predictions based on user similarity.

There are two other ways to cold-start a recommendation engine.

  1. Build a model yourself.
  2. Get your suppliers to fill in key information to a skeleton model. (Also may require $ incentives.)

There are lots of potential pitfalls in all of these, most of them too commonsensical to mention.

As you might expect, there is no free lunch here. But think about it this way: recommendation engines are not a business plan. They merely enhance the business plan.

时光暖心i 2024-08-23 19:08:25

There are three things needed to address the Cold-Start Problem:

  1. The data must have been profiled such that you have many different features (with product data the term used for 'feature' is often 'classification facets'). If you don't properly profile data as it comes in the door, your recommendation engine will stay 'cold' as it has nothing with which to classify recommendations.

  2. MOST IMPORTANT: You need a user-feedback loop through which users can review the personalization engine's suggestions. For example, a Yes/No button for 'Was This Suggestion Helpful?' should queue a review that can move participants from one training dataset (i.e. the 'Recommend' training dataset) to the other (i.e. the 'DO NOT Recommend' training dataset).

  3. The model used for (Recommend/DO NOT Recommend) suggestions should never be considered a one-size-fits-all recommendation. In addition to classifying the product or service to suggest to a customer, how the firm classifies each specific customer matters too. If it is functioning properly, one should expect that customers with different features will get different (Recommend/DO NOT Recommend) suggestions in a given situation. That would be the 'personalization' part of personalization engines.
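
The feedback loop in points 2 and 3 could be sketched roughly as follows. This is one possible reading, not a definitive design: the `FeedbackLoop` class, the `min_votes` threshold and the automatic majority-vote "review" are all assumptions (the review could just as well be a human step), and keying everything by a customer segment is a simple stand-in for classifying each customer.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FeedbackLoop:
    """Keeps (segment, item) pairs in a 'recommend' or 'do not recommend' set."""

    min_votes: int = 3  # assumed: votes needed before a pair gets reviewed
    recommend: set = field(default_factory=set)
    do_not_recommend: set = field(default_factory=set)
    # (segment, item) -> [yes_count, no_count]
    votes: dict = field(default_factory=lambda: defaultdict(lambda: [0, 0]))

    def record(self, segment, item, helpful):
        """Store one Yes/No answer to 'Was This Suggestion Helpful?'."""
        counts = self.votes[(segment, item)]
        counts[0 if helpful else 1] += 1
        if sum(counts) >= self.min_votes:
            self._review(segment, item)

    def _review(self, segment, item):
        """Move the pair between datasets according to the majority vote."""
        key = (segment, item)
        yes, no = self.votes[key]
        if no > yes:
            self.recommend.discard(key)
            self.do_not_recommend.add(key)
        else:
            self.do_not_recommend.discard(key)
            self.recommend.add(key)

    def should_recommend(self, segment, item):
        """Suggest the item unless feedback put it in the do-not-recommend set."""
        return (segment, item) not in self.do_not_recommend


loop = FeedbackLoop(min_votes=3)
for helpful in (False, False, True):
    loop.record("students", "luxury-watch", helpful)
loop.record("collectors", "luxury-watch", True)

print(loop.should_recommend("students", "luxury-watch"))
print(loop.should_recommend("collectors", "luxury-watch"))
```

Here the same item ends up suppressed for one hypothetical segment and still suggested to another, which is the personalization point 3 describes.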
