Lucene nested query evaluation regarding negation
I am adding Apache Lucene support to Querydsl (which offers type-safe queries for Java), and I am having trouble understanding how Lucene evaluates queries, especially negation in nested queries.
For instance, the following two queries are, in my opinion, semantically the same, but only the first one returns results.
+year:1990 -title:"Jurassic Park"
+year:1990 +(-title:"Jurassic Park")
The simplified object tree in the second example is shown below.
query : Query
  clauses : ArrayList
    [0] : BooleanClause
      "MUST" occur : BooleanClause.Occur
      "year:1990" query : TermQuery
    [1] : BooleanClause
      "MUST" occur : BooleanClause.Occur
      query : BooleanQuery
        clauses : ArrayList
          [0] : BooleanClause
            "MUST_NOT" occur : BooleanClause.Occur
            "title:"Jurassic Park"" query : TermQuery
Lucene's own QueryParser seems to turn "AND (NOT" into the same kind of object tree.
Is this a bug in Lucene, or have I misunderstood Lucene's query evaluation? I am happy to provide more information if necessary.
Comments (1)
They are not semantically the same.
In the second query, you have a subquery that contains only a single NOT clause. Lucene evaluates that clause on its own, and it returns 0 documents. You then indicate that the subquery MUST occur, and since it returns zero documents, it negates the rest of the query.
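To make the point concrete, here is a small toy model of the matching rules in plain Java. It is not Lucene code: the names NegationSketch, Doc, Bool, flat(), nested(), nestedFixed() and the three sample documents are all made up for illustration, and the model only assumes that a MUST_NOT clause can remove candidates but never supplies any. The third query shows the usual workaround of giving the subquery a positive clause to subtract from; in Lucene itself that would typically be a MatchAllDocsQuery (written *:* in query parser syntax).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of boolean clause matching; NOT the Lucene API.
final class NegationSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    // Hypothetical document with the two fields used in the question.
    static final class Doc {
        final int year;
        final String title;
        Doc(int year, String title) { this.year = year; this.title = title; }
    }

    interface Query { boolean matches(Doc doc); }

    static final class Clause {
        final Occur occur;
        final Query query;
        Clause(Occur occur, Query query) { this.occur = occur; this.query = query; }
    }

    static final class Bool implements Query {
        private final List<Clause> clauses = new ArrayList<>();

        Bool add(Occur occur, Query query) {
            clauses.add(new Clause(occur, query));
            return this;
        }

        @Override
        public boolean matches(Doc doc) {
            boolean hasMust = false;
            boolean shouldHit = false;
            for (Clause c : clauses) {
                boolean hit = c.query.matches(doc);
                if (c.occur == Occur.MUST) {
                    hasMust = true;
                    if (!hit) return false;      // a required clause failed
                } else if (c.occur == Occur.MUST_NOT) {
                    if (hit) return false;       // a prohibited clause matched
                } else if (hit) {
                    shouldHit = true;            // an optional clause matched
                }
            }
            // Only positive clauses can select a document. A query made
            // purely of MUST_NOT clauses therefore selects nothing.
            return hasMust || shouldHit;
        }
    }

    static Query year(int y) { return d -> d.year == y; }
    static Query title(String t) { return d -> d.title.equals(t); }
    static Query matchAll() { return d -> true; }

    // +year:1990 -title:"Jurassic Park"
    static Query flat() {
        return new Bool().add(Occur.MUST, year(1990))
                         .add(Occur.MUST_NOT, title("Jurassic Park"));
    }

    // +year:1990 +(-title:"Jurassic Park")
    static Query nested() {
        Bool inner = new Bool().add(Occur.MUST_NOT, title("Jurassic Park"));
        return new Bool().add(Occur.MUST, year(1990)).add(Occur.MUST, inner);
    }

    // +year:1990 +(+*:* -title:"Jurassic Park")
    static Query nestedFixed() {
        Bool inner = new Bool().add(Occur.MUST, matchAll())
                               .add(Occur.MUST_NOT, title("Jurassic Park"));
        return new Bool().add(Occur.MUST, year(1990)).add(Occur.MUST, inner);
    }

    // Made-up sample data, only for this sketch.
    static List<Doc> sampleDocs() {
        return Arrays.asList(new Doc(1990, "Jurassic Park"),
                             new Doc(1990, "Other Title"),
                             new Doc(1991, "Another Title"));
    }

    static int count(Query q, List<Doc> docs) {
        int n = 0;
        for (Doc d : docs) if (q.matches(d)) n++;
        return n;
    }

    public static void main(String[] args) {
        List<Doc> docs = sampleDocs();
        System.out.println("flat:        " + count(flat(), docs));
        System.out.println("nested:      " + count(nested(), docs));
        System.out.println("nestedFixed: " + count(nestedFixed(), docs));
    }
}
```

In this model the flat query and the fixed nested query both select the single 1990 document that is not titled "Jurassic Park", while the original nested query selects nothing, because its required subquery has no positive clause to start from.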