Recommender systems: simple user-based collaborative filtering, evaluated with precision and recall
I'm looking for a very simple Java implementation of user-based collaborative filtering. I would like to evaluate the precision and recall of this CF on the MovieLens dataset. I've seen that the performance (F1) should be around 20 to 30% (with Pearson similarity and KNN).
Does such a simple framework exist, including code for evaluating precision and recall?
Comments (2)
Apache Mahout does everything you mention here. It is Java-based and supports user-based collaborative filtering (among others) with GenericUserBasedRecommender. It is a k-nearest-neighbor algorithm into which you can plug similarity implementations such as PearsonCorrelationSimilarity and others.
Look at the org.apache.mahout.cf.taste package and its subpackages. In the .impl.eval subpackage, find GenericRecommenderIRStatsEvaluator. It runs a test that reports precision, recall and F1.
Finally, there are already some working examples based on GroupLens in mahout-examples.
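If it helps to see what these pieces compute before picking a library, here is a minimal self-contained sketch in plain Java. It is not Mahout code: the class UserCfSketch and all of its method names are hypothetical, and it assumes ratings are held as a map from item ID to rating and that "relevant" items are simply the held-out items a user liked. It shows Pearson similarity over co-rated items and precision, recall and F1 for one user's top-N list.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class for illustration only; not part of any library.
class UserCfSketch {

    // Pearson correlation between two users, computed over the items both have rated.
    // Returns 0.0 when fewer than two items overlap or when either user has no variance.
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        Set<Integer> common = new HashSet<>(a.keySet());
        common.retainAll(b.keySet());
        int n = common.size();
        if (n < 2) {
            return 0.0;
        }
        double meanA = 0.0;
        double meanB = 0.0;
        for (Integer item : common) {
            meanA += a.get(item);
            meanB += b.get(item);
        }
        meanA /= n;
        meanB /= n;
        double cov = 0.0;
        double varA = 0.0;
        double varB = 0.0;
        for (Integer item : common) {
            double da = a.get(item) - meanA;
            double db = b.get(item) - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        if (varA == 0.0 || varB == 0.0) {
            return 0.0;
        }
        return cov / Math.sqrt(varA * varB);
    }

    // Number of recommended items that are also in the relevant set.
    static int hits(List<Integer> recommended, Set<Integer> relevant) {
        int count = 0;
        for (Integer item : recommended) {
            if (relevant.contains(item)) {
                count++;
            }
        }
        return count;
    }

    // Fraction of the recommended items that are relevant.
    static double precision(List<Integer> recommended, Set<Integer> relevant) {
        if (recommended.isEmpty()) {
            return 0.0;
        }
        return (double) hits(recommended, relevant) / recommended.size();
    }

    // Fraction of the relevant items that were recommended.
    static double recall(List<Integer> recommended, Set<Integer> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) hits(recommended, relevant) / relevant.size();
    }

    // Harmonic mean of precision and recall; 0.0 when both are zero.
    static double f1(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }
}
```

To evaluate a whole recommender you would average these per-user values over all test users. How the relevant set is chosen (for example, a rating threshold on held-out items) is a design decision that moves the numbers a lot, so it is worth stating whenever you compare against a reported F1.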
Have you tried RapidMiner?
If you just want to try things like evaluating precision and recall without concentrating on coding, that's the tool for you. There is good information on the web, including papers and YouTube video tutorials, to help you.