Lucene payload scoring
I want to figure out how payload scoring works in Lucene. I don't understand where PayloadFunction fits in, so I suspect I don't really understand the mechanism as a whole. I tried googling it but found little beyond advice to read the source. It would be nice if someone could explain it here; otherwise, source code it is :)
Comments (1)
There are three parts to it. First, you generate payloads during analysis. This can be done with PayloadAttribute: you set this attribute on the terms you want as they pass through the analysis chain.

Then, at search time, you use a special query class, PayloadTermQuery. This class behaves like SpanTermQuery but also keeps track of the payloads stored in the index. With a custom Similarity implementation you can score each payload occurrence in a document.

Finally, with PayloadFunction you aggregate the per-occurrence payload scores over the document to produce the final document score.
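
To make the division of labour concrete, here is a minimal plain-Java sketch of the last two steps. It deliberately uses no Lucene classes: the names PayloadScoringSketch, Aggregator, scorePayload and payloadScore are hypothetical, the one-byte payload encoding is an assumption, and so is the choice to return 1 when a document has no payloads. It only models the idea that a Similarity-like step scores each payload occurrence and a PayloadFunction-like step folds those scores into one value per document.

```java
import java.util.Arrays;
import java.util.List;

public class PayloadScoringSketch {

    /** Hypothetical stand-in for the aggregation role played by PayloadFunction. */
    public interface Aggregator {
        // Called once per payload occurrence; returns the new running value.
        float accumulate(int seen, float running, float payloadScore);
        // Called once per document after all occurrences have been visited.
        float finish(int seen, float running);
    }

    /** Averages the per-occurrence scores (assumed to return 1 if there are none). */
    public static final Aggregator AVERAGE = new Aggregator() {
        public float accumulate(int seen, float running, float payloadScore) {
            return running + payloadScore;
        }
        public float finish(int seen, float running) {
            return seen > 0 ? running / seen : 1f;
        }
    };

    /** Keeps the largest per-occurrence score (assumed to return 1 if there are none). */
    public static final Aggregator MAX = new Aggregator() {
        public float accumulate(int seen, float running, float payloadScore) {
            return seen == 0 ? payloadScore : Math.max(running, payloadScore);
        }
        public float finish(int seen, float running) {
            return seen > 0 ? running : 1f;
        }
    };

    /** Role of the custom Similarity: turn one payload into one score.
     *  Assumption: the payload is a single byte holding a boost value. */
    public static float scorePayload(byte[] payload) {
        return (payload == null || payload.length == 0) ? 1f : (float) payload[0];
    }

    /** Walks every occurrence of the term in one document and aggregates. */
    public static float payloadScore(List<byte[]> payloads, Aggregator agg) {
        int seen = 0;
        float running = 0f;
        for (byte[] p : payloads) {
            running = agg.accumulate(seen, running, scorePayload(p));
            seen++;
        }
        return agg.finish(seen, running);
    }

    public static void main(String[] args) {
        // Three occurrences of the same term in one document, with payloads 2, 4 and 6.
        List<byte[]> occurrences =
                Arrays.asList(new byte[]{2}, new byte[]{4}, new byte[]{6});
        System.out.println(payloadScore(occurrences, AVERAGE)); // prints 4.0
        System.out.println(payloadScore(occurrences, MAX));     // prints 6.0
    }
}
```

In this model, swapping AVERAGE for MAX changes only how the occurrences are combined, not how each one is scored, which is the separation the answer describes between Similarity and PayloadFunction.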