Understanding the relationship between boosting a document at index time and its score at search time in Lucene
When indexing, I boost certain documents, but they do not appear at the top of the list of retrieved documents. I looked at the scores of those documents, and somehow the score of every retrieved document is NaN.
What is the relationship between the boost of a document at index time and its score at search time? I thought these would be correlated, and further, I thought I would get a wide range of scores in my ScoreDocs, not just NaN. If you can shed some light on this I would be grateful.
I have read http://lucene.apache.org/java/2_3_2/api/org/apache/lucene/search/Similarity.html and can't figure out what is missing.
Here is the simple boosting code:
if (myCondition)
{
myDocument.SetBoost(1.1f);
}
myIndexWriter.AddDocument(myDocument);
Comments (1)
I'm going to take a wild guess here, since you haven't provided a sample of your search code, but a common reason the score of retrieved docs is NaN is that you are searching with a Sort. When sorting, the score of the documents is usually not needed, so scoring is disabled by default.
If you use a Sort for your search and want the score, check the setDefaultFieldSortScoring method of the IndexSearcher class. This method lets you enable scoring of the documents in a search that uses a Sort.
http://lucene.apache.org/java/2_9_4/api/all/org/apache/lucene/search/IndexSearcher.html#setDefaultFieldSortScoring(boolean, boolean)
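As for how an index-time boost relates to the score once scoring is actually enabled, here is a rough sketch of the arithmetic described in the Similarity documentation linked in the question. It does not use Lucene at all: the class BoostSketch, the helper termScore, and the sample numbers are made up for illustration, and the formulas assume the default similarity (tf = sqrt(freq), idf = ln(numDocs / (docFreq + 1)) + 1, lengthNorm = 1 / sqrt(numTerms)). It is written in Java because the linked docs are for Java Lucene. It also ignores that Lucene stores the norm in a single byte with some loss of precision, so in a real index a boost as small as 1.1 may be partly or entirely rounded away.

```java
// Plain-Java sketch of how a document boost enters a single term's score.
// Assumption: formulas follow the default similarity as documented for
// org.apache.lucene.search.Similarity; no Lucene classes are used here.
final class BoostSketch {

    // Term frequency factor: sqrt(freq).
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Inverse document frequency: ln(numDocs / (docFreq + 1)) + 1.
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Length normalization: 1 / sqrt(number of terms in the field).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Index-time norm: document boost * field boost * lengthNorm.
    // (Real Lucene then encodes this into one byte; not modeled here.)
    static float norm(float docBoost, float fieldBoost, int numTerms) {
        return docBoost * fieldBoost * lengthNorm(numTerms);
    }

    // Contribution of one query term, leaving out coord, queryNorm and
    // the query-time term boost, which are the same for both documents.
    static float termScore(float freq, int docFreq, int numDocs, float norm) {
        float idf = idf(docFreq, numDocs);
        return tf(freq) * idf * idf * norm;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: term occurs once, in 9 of 100 docs, field has 4 terms.
        float plain = termScore(1f, 9, 100, norm(1.0f, 1.0f, 4));
        float boosted = termScore(1f, 9, 100, norm(1.1f, 1.0f, 4));
        System.out.println("plain=" + plain + " boosted=" + boosted
                + " ratio=" + (boosted / plain));
    }
}
```

In this model the boost is just a multiplier on the matching term's contribution, so a 1.1 boost only moves a document ahead of others whose unboosted scores are within about 10% of it. Nothing in the formula produces NaN, which is consistent with the NaN values coming from scoring being switched off for sorted searches rather than from the boost itself.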