Manipulating the score/ranking of NHibernate.Search query results
I've been working with NHibernate, NHibernate.Search and Lucene.Net to improve the search engine used on the website I develop.
Basically, I use it to search the contents of corporations' specification documents. This is not to be confused with Lucene's notion of documents: in my case, a specification document (which I'll hereafter call a "specdoc") can contain many pages, and the content of these pages is what actually gets indexed (thus, the pages themselves are what fall under Lucene's concept of documents). So the pages belong to a specdoc, which in turn belongs to a corporation (so a corporation can have many specdocs). I'm using the NHibernate.Search "IndexEmbedded" and "ContainedIn" attributes to associate the pages with their specdoc and the specdocs with their corporations, so I can query for terms in specdoc pages and have Lucene/NH.Search return either the pages themselves, the specdocs, or the corporations that match the query on the pages. I can query this way and get ranked results, and so present results (that is, corporations, specdocs or pages) by relevance, which is great.
But now I need something more. Specifically, when I query terms and have NH.Search return the matching corporations, I need to manually/artificially tune the score of some of the results, because there are corporations that I want to show up at the top of the result set. Think of "sponsored results".
I'm thinking of doing it in my application, maybe by creating an entity/database table that contains an association to the corporation entity and a score boost value. But I don't know how to feed this to Lucene and have it boost the results accordingly at search time. Initially I thought about deriving a Similarity class to do this, but it doesn't look like Similarity can be used to modify result sets at search time. As per this page, it looks like what I need is to mess around with weight or scoring. But the docs are a little superficial in that there are no examples of how to implement custom scoring, let alone integrate it with NH.Search.
So, does anyone know how to do this, or can anyone point me to some documentation or a working example of how to do something similar?
Thanks!
Comments (1)
From what I understand, you just want to be able to set a boost at query time instead of at index time. This can be done easily. When you build your query, you can set the boost then. The Query object has a SetBoost method (exposed as the Boost property in newer Lucene.Net releases) that lets you boost the documents that match the whole query. This is useful when you are combining two term queries and want one of them to be boosted. But if you are using something like QueryParser to build your queries, the query parser syntax lets you set the boost for individual terms. More about that here: http://lucene.apache.org/java/2_9_0/queryparsersyntax.html#Boosting%20a%20Term. If you are using the query parser, you could possibly use a regex, or otherwise adjust the query string, to add the boost symbol (^) to a term, or you could look into creating your own query parser, which adds the boost whenever it decides one is needed. I've created my own query parser, and it isn't that difficult. Here is some information about that: http://openedu.ossreleasefeed.com/tutorials/apache-lucene-extending-the-queryparser/
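To make the "adjust the query string" idea concrete, here is a minimal sketch of how the sponsored-corporation boosts from the question might be appended to the user's query using the `^` boost syntax. It only builds a string, so it is written in Python to stay library-free; the result would still have to be handed to QueryParser on the .NET side. The function name `build_sponsored_query`, the field name `CorporationId` and the boost values are all hypothetical, and the sketch assumes the corporation id is indexed as a searchable field and that the user's query text is already valid (escaped) parser syntax.

```python
def build_sponsored_query(user_query, sponsor_boosts, id_field="CorporationId"):
    """Build a Lucene query-parser string with boosted sponsor clauses.

    user_query     -- the user's search text, assumed to be valid parser syntax
    sponsor_boosts -- hypothetical mapping {corporation id: boost factor}
    id_field       -- hypothetical name of the indexed corporation id field
    """
    # "+" makes the user's query a required clause, so sponsors that do not
    # match the search terms are not pulled into the results.
    required = "+(" + user_query + ")"

    # Each sponsor becomes an optional clause with a "^boost" suffix; it only
    # adds to the score of documents that already satisfy the required clause.
    optional = [
        f"{id_field}:{corp_id}^{boost:g}"
        for corp_id, boost in sorted(sponsor_boosts.items())
    ]
    return " ".join([required] + optional)


if __name__ == "__main__":
    print(build_sponsored_query("pump valve", {42: 10.0, 7: 2.5}))
```

Note that a boost raises a score rather than forcing a position, so the values would probably need tuning before sponsored corporations reliably land at the top.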