Recommendation system for a bookstore application

Posted on 2024-10-19 10:47:23

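Hey, I am trying to learn some of the recommendation algorithms used on sites such as Amazon.com. I have a simple Java (Spring, Hibernate, Postgres) bookstore application in which a Book has the attributes title, category, tags, and author. For simplicity, the books have no content; a book is identified only by its title, category, author, and tags. For every user who logs in to the application, I should be able to recommend some books. Each user can view a book, add it to the cart, and buy it at any time. So in the database I store how many times each user has viewed each book, the books in the user's cart, and the books the user has bought. There is no rating option at the moment, but one could be added.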

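So can someone tell me which algorithms I could use to show each user some book recommendations? I want to keep things very simple. This is not a project for sale; it is only meant to extend my knowledge of recommendation algorithms. So assume there are only about 30 books in total (5 categories with 6 books each). It would also be really helpful if someone could tell me which attributes I should use to compute the similarity between two users, and how to go about that with the recommended algorithm.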

Thanks in advance. Serotonin Chase.


任性一次 2024-10-26 10:47:23

You can find all the information, along with a library implementing the common algorithms (the Taste framework), here.

"Collective Intelligence in Action" is another book I can recommend, in addition to those suggested by the other posters.


古镇旧梦 2024-10-26 10:47:23

As a concrete example, one option is the "K nearest neighbours" algorithm.

To simplify things, imagine you only had ten books, and you were only tracking how many times each user viewed each book. Then, for each user, you might have an array int timesViewed[10], where the value of timesViewed[i] is the number of times the user has viewed book number i.

You can then compare the user to all of the other users using a correlation function, such as the Pearson correlation. Computing the correlation between the current user c and another user o gives a value between -1.0 and 1.0, where -1.0 means "user c's viewing pattern is the complete opposite of user o's" and 1.0 means "user c's viewing pattern matches user o's".
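The comparison described above could be sketched as follows. This is a minimal illustration rather than a reference implementation: the class name `PearsonSketch` and the method `pearson` are hypothetical, and returning 0.0 when a user's counts have no variance is an assumption of this sketch (the correlation is mathematically undefined in that case).

```java
// Hypothetical helper; the names are illustrative and not from any library.
final class PearsonSketch {

    /** Pearson correlation of two equal-length view-count arrays, in [-1.0, 1.0]. */
    static double pearson(int[] a, int[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int n = a.length;

        // Mean number of views per book for each user.
        double meanA = 0, meanB = 0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;

        // Accumulate the covariance and both variances (unnormalised).
        double cov = 0, varA = 0, varB = 0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }

        // Assumption: a user who viewed every book equally often has zero
        // variance, so the correlation is undefined; treat it as "no similarity".
        if (varA == 0 || varB == 0) {
            return 0.0;
        }
        return cov / Math.sqrt(varA * varB);
    }
}
```

For example, `PearsonSketch.pearson(timesViewedC, timesViewedO)` would score user o against user c, where both arrays are indexed by book number.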

If you compute the correlation between c and every other user, you get a list of results showing how similar the user's viewing pattern is to that of each other user. You then pick the K (e.g. 5, 10, 20) most similar results (hence the name of the algorithm), that is, the K users with the correlation scores closest to 1.0.
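Picking the K most similar users might look like the sketch below. It assumes the correlation scores have already been computed into an array indexed by the other users (with the current user left out); `NearestNeighbours` and `topK` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper; the names are illustrative and not from any library.
final class NearestNeighbours {

    /**
     * Returns the indices of the k highest similarity scores, best first.
     * If fewer than k scores exist, all indices are returned.
     */
    static int[] topK(double[] similarity, int k) {
        if (k < 0) {
            throw new IllegalArgumentException("k must not be negative");
        }
        // Sort user indices by descending similarity score.
        Integer[] order = new Integer[similarity.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> similarity[i]).reversed());

        int count = Math.min(k, order.length);
        int[] result = new int[count];
        for (int i = 0; i < count; i++) {
            result[i] = order[i];
        }
        return result;
    }
}
```

With only a handful of users a full sort is perfectly adequate; a bounded heap would only start to matter for much larger user counts.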

Now you can take a weighted average of those users' timesViewed arrays. For example, averageTimesViewed[0] is the average of timesViewed[0] over those K users, weighted by their correlation scores. Then do the same for every other averageTimesViewed[i].

Now you have an array averageTimesViewed which contains, roughly speaking, the average number of times the K users with the most similar viewing patterns to c have viewed each book. Recommend the book which has the highest averageTimesViewed score, since this is the book the other users have shown the most interest in.
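The weighted average and the final pick could be sketched like this. The names `WeightedViews`, `weightedAverage` and `bestBook` are hypothetical. One assumption goes beyond the description above: neighbours with a zero or negative correlation are skipped, because a negative weight would distort the average.

```java
// Hypothetical helper; the names are illustrative and not from any library.
final class WeightedViews {

    /**
     * neighbourViews[k][i] is how often neighbour k viewed book i;
     * weights[k] is that neighbour's correlation with the current user.
     */
    static double[] weightedAverage(int[][] neighbourViews, double[] weights) {
        if (neighbourViews.length != weights.length) {
            throw new IllegalArgumentException("one weight per neighbour is required");
        }
        int numBooks = neighbourViews.length == 0 ? 0 : neighbourViews[0].length;
        double[] average = new double[numBooks];
        double totalWeight = 0;

        for (int k = 0; k < neighbourViews.length; k++) {
            // Assumption of this sketch: ignore non-positive correlations.
            if (weights[k] <= 0) {
                continue;
            }
            totalWeight += weights[k];
            for (int i = 0; i < numBooks; i++) {
                average[i] += weights[k] * neighbourViews[k][i];
            }
        }
        // Normalise so the result is an average rather than a weighted sum.
        if (totalWeight > 0) {
            for (int i = 0; i < numBooks; i++) {
                average[i] /= totalWeight;
            }
        }
        return average;
    }

    /** Index of the highest score, or -1 if there are no books. */
    static int bestBook(double[] scores) {
        int best = -1;
        for (int i = 0; i < scores.length; i++) {
            if (best < 0 || scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

For two neighbours with views {4, 0, 2} and {0, 2, 2} and weights 0.75 and 0.25, the averages come out as 3.0, 0.5 and 2.0, so book 0 would be recommended.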

It is usually also worth excluding books the user has already viewed from the recommendations, but those books should still be included when computing the similarity/correlation.
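One way to apply that filter is at the very last step, after the similarity scores and averages have been computed over all books. In the sketch below, `UnseenRecommendation` and `bestUnseenBook` are hypothetical names, and "already viewed" is assumed to mean a view count greater than zero.

```java
// Hypothetical helper; the names are illustrative and not from any library.
final class UnseenRecommendation {

    /**
     * Returns the index of the highest-scoring book that the current user
     * has not viewed yet, or -1 if every book has already been viewed.
     */
    static int bestUnseenBook(double[] scores, int[] timesViewed) {
        if (scores.length != timesViewed.length) {
            throw new IllegalArgumentException("one score per book is required");
        }
        int best = -1;
        for (int i = 0; i < scores.length; i++) {
            // Skip anything the current user has already looked at.
            if (timesViewed[i] > 0) {
                continue;
            }
            if (best < 0 || scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }
}
```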

Note that this can be trivially extended to take other data into account (such as cart lists). You can also select all users if you want (i.e. K = number of users), but that doesn't always produce meaningful results; picking a reasonably small K is usually sufficient for good results and is quicker to compute.
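As one possible way to fold in the cart and purchase data, each user's view counts could be replaced by a combined interest score before the correlation is computed. The weights below (3 for a book in the cart, 5 for a purchased book) are arbitrary assumptions made for illustration, as are the names `InterestScore` and `combine`.

```java
// Hypothetical helper; the names and the weights are illustrative assumptions.
final class InterestScore {

    // Arbitrary example weights: a cart entry or purchase counts as several views.
    static final int CART_WEIGHT = 3;
    static final int PURCHASE_WEIGHT = 5;

    /** Combines views, cart membership and purchases into one score per book. */
    static int[] combine(int[] timesViewed, boolean[] inCart, boolean[] purchased) {
        if (timesViewed.length != inCart.length || timesViewed.length != purchased.length) {
            throw new IllegalArgumentException("all arrays must cover the same books");
        }
        int[] score = new int[timesViewed.length];
        for (int i = 0; i < score.length; i++) {
            score[i] = timesViewed[i];
            if (inCart[i]) {
                score[i] += CART_WEIGHT;
            }
            if (purchased[i]) {
                score[i] += PURCHASE_WEIGHT;
            }
        }
        return score;
    }
}
```

The resulting array can be used wherever timesViewed was used before, so the rest of the procedure stays unchanged.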
