I'm currently upgrading the Spring Boot version of my project. After upgrading from 2.5 to 2.6, a few tests that deal with retrieving Elasticsearch documents started failing. I'm trying to fetch only the highest-scoring documents, but when I expect 2 identical documents, only 1 is retrieved.
After reading up on the issue, I figured out that the problem comes down to the Elasticsearch index using multiple shards, each scoring documents on its own, and (probably?) the identical documents being fetched from different shards, which results in different scores even though the documents are virtually the same.
Now, can anyone tell me why this happens in the newer spring-data-elasticsearch version, and whether there is a setting to restore the old behavior?
I've set up a little test project to play around with this. If anyone is interested in trying this for themselves, feel free to check it out: https://github.com/Moldavis/elasticsearch-scoring-poc
Comments (1)
Actually found my own answer in the Spring Data breaking changes documentation (duh).
https://docs.spring.io/spring-data/elasticsearch/docs/current/reference/html/#elasticsearch-migration-guide-4.2-4.3.breaking-changes
The dfs_query_then_fetch option first queries all shards for their document and term frequencies, so that scores are evened out between different shards. This is no longer used by default, hence the problem described above.
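To see why per-shard statistics can split the scores of identical documents, here is a minimal sketch using only the inverse document frequency part of BM25 (term frequency and length normalization are left out). The shard sizes and match counts are made-up numbers, and `ShardIdfDemo` is just an illustrative name, not anything from the test project.

```java
final class ShardIdfDemo {

    // BM25-style inverse document frequency: ln(1 + (N - n + 0.5) / (n + 0.5)),
    // where N is the number of documents and n the number containing the term.
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    public static void main(String[] args) {
        // Hypothetical shard statistics, chosen only for illustration.
        long docsA = 10, matchesA = 1; // shard A: term is rare here
        long docsB = 3, matchesB = 2;  // shard B: term is common here

        // query_then_fetch: each shard scores with its own local statistics,
        // so the same document gets a different weight depending on its shard.
        double localA = idf(docsA, matchesA);
        double localB = idf(docsB, matchesB);

        // dfs_query_then_fetch: statistics are gathered across shards first,
        // so both shards score with the same global value.
        double global = idf(docsA + docsB, matchesA + matchesB);

        System.out.printf("local idf on shard A: %.3f%n", localA);
        System.out.printf("local idf on shard B: %.3f%n", localB);
        System.out.printf("global idf on both shards: %.3f%n", global);
    }
}
```

With local statistics the two copies of the document end up with different scores, so a "highest score only" filter keeps just one of them. With global statistics they tie again.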
It can be fixed by setting the search type for the query like so:
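As a rough sketch of the same switch at the plain Elasticsearch REST level (not the Spring Data query code itself): the search API accepts a `search_type` request parameter, and `dfs_query_then_fetch` is one of its values. The host `http://localhost:9200`, the index name `documents` and the helper `searchUri` are assumptions made up for this example.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SearchTypeRequest {

    // Builds the _search URI for an index with an explicit search_type parameter.
    static URI searchUri(String baseUrl, String index, String searchType) {
        String encodedIndex = URLEncoder.encode(index, StandardCharsets.UTF_8);
        String encodedType = URLEncoder.encode(searchType, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + encodedIndex + "/_search?search_type=" + encodedType);
    }

    public static void main(String[] args) {
        // Hypothetical local cluster and index name, for illustration only.
        URI uri = searchUri("http://localhost:9200", "documents", "dfs_query_then_fetch");
        System.out.println(uri);
    }
}
```

In Spring Data Elasticsearch the same value is set on the query object rather than on the URL, but the effect on the request sent to the cluster should be the same.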