How do I implement a recommendation engine?

Posted on 2024-10-03 20:20:15

Please be patient with my writing, as my English is not proficient.

As a programmer, I want to learn about the algorithms, or the machine learning techniques, that are implemented underneath recommendation systems and related systems. The most obvious example is Amazon, which has a really good recommendation system. It knows things like "if you like this, you might also like that", or "what percentage of people like this and that together".

Of course, I know Amazon is a big website that has invested a lot of brainpower and money in these systems. But at the most basic level, how can we implement something like that on top of our own database? How can we identify how one object relates to another? How can we build a statistical component that handles this kind of thing?

I'd appreciate it if someone could point out some algorithms or, more basically, some good direct references or books that we can all learn from. Thank you all!


Comments (4)

挖个坑埋了你 2024-10-10 20:20:15

There are two different types of recommendation engines.

The simplest is item-based, i.e. "customers who bought product A also bought product B". This is easy to implement: store a sparse symmetric n×n matrix (where n is the number of items) in which each element m[a][b] is the number of times anyone has bought item 'a' along with item 'b'.
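
A minimal sketch of that co-purchase matrix in Python, assuming purchases arrive as one list of item IDs per customer. The names `build_cooccurrence` and `also_bought` and the sample data are hypothetical. A dict of Counters stands in for the sparse symmetric matrix, and each customer is counted at most once per pair, which is one possible reading of "number of times":

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(baskets):
    """Return m where m[a][b] is the number of customers who bought both a and b."""
    m = defaultdict(Counter)
    for basket in baskets:
        # Deduplicate so a repeat purchase by one customer counts once.
        for a, b in combinations(sorted(set(basket)), 2):
            m[a][b] += 1
            m[b][a] += 1  # mirror the entry to keep the matrix symmetric
    return m


def also_bought(m, item, n=3):
    """Items most often bought together with `item`, most frequent first."""
    return [other for other, _ in m[item].most_common(n)]


# Hypothetical purchase histories, one list per customer.
baskets = [
    ["book", "lamp", "pen"],
    ["book", "pen"],
    ["lamp", "desk"],
    ["book", "pen", "desk"],
]
m = build_cooccurrence(baskets)
print(also_bought(m, "book"))  # "pen" comes first: bought with "book" 3 times
```

Only pairs that actually occur are stored, so memory grows with the number of distinct co-purchased pairs rather than with n squared.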

The other is user-based, i.e. "people like you often like things like this". A possible solution to this problem is k-means clustering: construct a set of clusters so that users with similar taste are placed in the same cluster, then make suggestions based on what other users in the same cluster like.
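
A rough sketch of the clustering idea in plain Python, assuming each user is represented as a 0/1 vector of liked items. The functions `kmeans` and `recommend` and the sample data are hypothetical, and the centroids are seeded with the first k users purely to keep the example deterministic (real implementations usually pick seeds more carefully):

```python
def distance_sq(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; returns one cluster label per input vector."""
    centroids = [list(v) for v in vectors[:k]]  # simplification: first k as seeds
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each user joins the nearest centroid.
        labels = [
            min(range(k), key=lambda c: distance_sq(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def recommend(user, likes, labels, n=3):
    """Items liked by others in the user's cluster that the user has not liked."""
    scores = {}
    for other, label in enumerate(labels):
        if other == user or label != labels[user]:
            continue
        for item, liked in enumerate(likes[other]):
            if liked and not likes[user][item]:
                scores[item] = scores.get(item, 0) + 1
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Hypothetical data: rows are users, columns are items, 1 means "liked".
likes = [
    [1, 1, 0, 0, 0],  # user 0
    [0, 0, 1, 1, 0],  # user 1
    [1, 1, 1, 0, 0],  # user 2
    [0, 0, 1, 1, 1],  # user 3
    [1, 0, 0, 0, 0],  # user 4
]
labels = kmeans(likes, k=2)
print(recommend(4, likes, labels))  # [1, 2]
```

Here users 0, 2 and 4 end up in one cluster, so user 4 is offered item 1 (liked by both neighbours) ahead of item 2 (liked by one).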

A better but even more complicated solution is a technique called Restricted Boltzmann Machines. There's an introduction to them here

浮生未歇 2024-10-10 20:20:15

A first attempt could look like this:

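using System.Collections.Generic;
using System.Linq;

// ProductID, Customers and BoughtProducts are assumed to be defined elsewhere.
// First, count how often each ordered product pair was bought together.
// Time and memory are roughly the sum over all customers of BoughtProducts.Count^2.
var boughtTogether = new Dictionary<(ProductID, ProductID), int>();
foreach (var customer in Customers)
{
    foreach (var product1 in customer.BoughtProducts)
        foreach (var product2 in customer.BoughtProducts)
        {
            if (product1.Equals(product2)) continue; // do not pair a product with itself
            var pair = (product1, product2);
            boughtTogether.TryGetValue(pair, out int counter); // counter stays 0 if the pair is missing
            boughtTogether[pair] = counter + 1;
        }
}

// Then, for each product, keep the 10 products most often bought with it.
var recommendations = boughtTogether
    .GroupBy(entry => entry.Key.Item1)
    .ToDictionary(
        group => group.Key,
        group => group
            .OrderByDescending(entry => entry.Value)
            .Take(10)
            .Select(entry => new { ProductID = entry.Key.Item2, Count = entry.Value })
            .ToList());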

First I calculate how often each pair of products was bought together, then I group the pairs by product and select the top 10 other products bought with it. The result should be put into some kind of dictionary keyed by product ID.

This might get too slow or cost too much memory for large databases.
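
One way to soften the memory cost, sketched here in Python under the assumption that product IDs are comparable values such as integers: count each unordered pair once instead of storing both (a, b) and (b, a), which roughly halves the dictionary, and pick the top entries per product with a heap rather than a full sort. The names `count_pairs` and `top_related` and the sample data are hypothetical:

```python
from collections import Counter
from heapq import nlargest
from itertools import combinations


def count_pairs(baskets):
    """Count each unordered product pair once, keyed as (smaller_id, larger_id)."""
    counts = Counter()
    for basket in baskets:
        # Sorting makes every pair come out in the same (small, large) order.
        counts.update(combinations(sorted(set(basket)), 2))
    return counts


def top_related(counts, n=10):
    """Map each product to the n products most often bought with it."""
    by_product = {}
    for (a, b), count in counts.items():
        by_product.setdefault(a, []).append((count, b))
        by_product.setdefault(b, []).append((count, a))
    return {
        product: [other for _, other in nlargest(n, pairs)]
        for product, pairs in by_product.items()
    }


# Hypothetical purchase histories, one list of product IDs per customer.
baskets = [[1, 2, 3], [1, 2], [2, 3], [1, 2, 4]]
counts = count_pairs(baskets)
related = top_related(counts, n=2)
print(related[2])  # [1, 3]
```

This does not change the quadratic cost per customer; for very large catalogues the counting would likely have to move into the database or a batch job.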

佞臣 2024-10-10 20:20:15

I think you are talking about knowledge-base systems. I don't remember the programming language (maybe LISP), but there are implementations. Also, look at OWL.

情场扛把子 2024-10-10 20:20:15

There's also prediction.io if you're looking for an open source solution or SaaS solutions like mag3llan.com.
