Open source full-text article recommendation engine
I'm wondering if there are any good .NET recommendation algorithms available in open source projects, whether attached to a search engine or not. By recommendation I mean something that accepts a full-text article and recommends other articles from its index based on keyword similarity.
At the high end there are document classification engines like Autonomy; at the low end, spam filters and blog "related posts" widgets. Possibly advertisement-to-article matching, too. I'd like to incorporate one into a project, but I can't afford the high end, and the low end seems to be all LAMP-based.
[Sorry, one answer asked for clarification: what I'm looking for is ideally a standalone library, but I'm willing to adapt good source code as necessary. The end result is that I need to be able to create a C# service that accepts an arbitrary amount of text and returns a list of similar, previously indexed articles. Basically, the exact thing that StackOverflow itself does as you are submitting a question!]
Thanks!
Steve
Comments (2)
I think that StackOverflow strips all the common English words out of the text and then compares the words that are left with the remaining words of other posts to get the "Related" posts.
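To make that idea concrete, here is a rough sketch of it. It is written in Python rather than .NET just to keep it short, and it is only a guess at the general technique, not StackOverflow's actual algorithm. The stop-word list, the Jaccard overlap score, and the names `keywords`, `similarity`, and `related` are all assumptions made up for illustration.

```python
import re

# Tiny illustrative stop-word list (an assumption); a real list would be much longer.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "how",
    "i", "in", "is", "it", "of", "on", "or", "that", "the", "this", "to", "with",
}


def keywords(text):
    """Lowercase the text, split it into words, and drop the common ones."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def similarity(a, b):
    """Jaccard overlap between two keyword sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def related(query, indexed, limit=5):
    """Return titles of indexed articles that share keywords with the query,
    best match first. `indexed` maps a title to the article body."""
    q = keywords(query)
    scored = [(similarity(q, keywords(body)), title) for title, body in indexed.items()]
    scored = [(score, title) for score, title in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # highest score first, ties by title
    return [title for _, title in scored[:limit]]


# Hypothetical sample data, made up for this example.
articles = {
    "Lucene basics": "How to index documents with the Lucene search library",
    "Spam filters": "Bayesian spam filters classify email by word frequency",
    "Gardening": "Planting tomatoes in spring",
}
print(related("Is there a search library to index full-text documents?", articles))
```

Plain set overlap treats every remaining word as equally important; weighting rare words more heavily (TF-IDF style) would probably rank better on a large index, but that goes beyond what this sketch shows.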
The question is not very clear (algorithm or library???), but the only thing that comes to mind is Lucene.NET, the port of the popular Lucene library to the .NET Framework. HTH.