Finding a user's favorite genres from rating values
Suppose a user rates some movies on a scale of 1 to 5. Each movie has genre information, and a movie can have more than one genre. For example:
Movie A Rating 4
Action/Sci-Fi
Movie B Rating 5
Comedy/Action
Movie C Rating 4
Comedy/Drama
We want to learn which genres our user likes. Here is our result set:
Genre    Movie_Count  Average_Rating
------------------------------------
Action   2            4.5
Comedy   2            4.5
Sci-Fi   1            4
Drama    1            4
Obviously, we cannot predict anything from such a small result set, but let us assume that we have a larger dataset.
Using this data, how can we rank this user's most liked genres? Is simply calculating a weighted average enough, or is something more complex needed?
Comments (1)
The main problem I see here is:
The user rates 1000 comedy movies with an average score of 4.
The user rates 10 action movies with an average score of 4.1.
How do you order them?
See http://www.evanmiller.org/how-not-to-sort-by-average-rating.html for discussion and one possible solution.
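The linked article suggests ranking by the lower bound of the Wilson score confidence interval rather than by the raw average. That interval is defined for positive/negative votes, so the sketch below assumes that a rating of 4 or higher counts as "liked". The threshold, the function names, and the sample data are illustrative assumptions, not part of the question.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval (z=1.96 is roughly 95%)."""
    if total == 0:
        return 0.0
    p = positive / total
    z2 = z * z
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


def rank_genres(ratings_by_genre, liked_threshold=4):
    """Order genres by the Wilson lower bound of their 'liked' fraction.

    Assumption: a rating >= liked_threshold is treated as a positive vote.
    """
    scores = {}
    for genre, ratings in ratings_by_genre.items():
        liked = sum(1 for r in ratings if r >= liked_threshold)
        scores[genre] = wilson_lower_bound(liked, len(ratings))
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data matching the example: 1000 comedies averaging 4.0
# and 10 action movies averaging 4.1.
sample = {
    "Comedy": [5] * 250 + [4] * 500 + [3] * 250,
    "Action": [5] * 3 + [4] * 5 + [3] * 2,
}
ranking = rank_genres(sample)
print(ranking)
```

With this data Comedy ranks above Action even though its average is slightly lower, because 10 ratings leave much more uncertainty than 1000.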
Another problem would be:
If a movie is both comedy and action and was given a rating of 4.0, how much of that rating is due to it being a comedy and how much to it being an action movie?
You can solve this using expectation maximization: http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm
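As a rough illustration of how expectation maximization could be applied here, the sketch below assumes a very simple model: each rating is "explained" by exactly one of the movie's genres (a hidden choice, equally likely a priori), and ratings for a genre are normally distributed around that genre's mean with a fixed spread `sigma`. The model, the function name, and the data are all assumptions made for the example; a real system would likely need a richer model.

```python
import math


def em_genre_means(movies, sigma=1.0, iterations=50):
    """Estimate a per-genre mean rating with EM.

    movies: list of (rating, [genres]) tuples.
    Assumed model: one hidden genre per movie explains its rating,
    rating ~ Normal(mean[genre], sigma), sigma fixed.
    """
    genres = sorted({g for _, gs in movies for g in gs})
    # Initialise each genre with its plain average rating.
    means = {}
    for g in genres:
        rs = [r for r, gs in movies if g in gs]
        means[g] = sum(rs) / len(rs)

    for _ in range(iterations):
        num = {g: 0.0 for g in genres}
        den = {g: 0.0 for g in genres}
        for rating, gs in movies:
            # E-step: how responsible is each genre for this rating?
            weights = [
                math.exp(-((rating - means[g]) ** 2) / (2 * sigma ** 2))
                for g in gs
            ]
            total = sum(weights)
            if total == 0.0:
                continue  # numerical underflow: skip this movie
            for g, w in zip(gs, weights):
                resp = w / total
                num[g] += resp * rating
                den[g] += resp
        # M-step: responsibility-weighted mean rating per genre.
        for g in genres:
            if den[g] > 0.0:
                means[g] = num[g] / den[g]
    return means


# Hypothetical data: the mixed movie's 5 is mostly credited to Action.
data = [
    (5, ["Action"]), (5, ["Action"]),
    (3, ["Comedy"]), (3, ["Comedy"]),
    (5, ["Action", "Comedy"]),
]
genre_means = em_genre_means(data)
print(genre_means)
```

In this toy data the Comedy estimate ends up below its plain average of about 3.67, since most of the credit for the mixed movie's high rating goes to Action.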