Lucene query evaluation regarding negation in nested queries

Posted on 2024-08-25 18:48:29

I am adding Apache Lucene support to Querydsl (which offers type-safe queries for Java), and I am having trouble understanding how Lucene evaluates queries, especially regarding negation in nested queries.

For instance, the following two queries are, in my opinion, semantically the same, but only the first one returns results.

+year:1990 -title:"Jurassic Park"
+year:1990 +(-title:"Jurassic Park")

The simplified object tree in the second example is shown below.

query : Query
  clauses : ArrayList
    [0] : BooleanClause
      "MUST" occur : BooleanClause.Occur
      "year:1990" query : TermQuery
    [1] : BooleanClause
      "MUST" occur : BooleanClause.Occur
      query : BooleanQuery
        clauses : ArrayList
          [0] : BooleanClause
            "MUST_NOT" occur : BooleanClause.Occur
            "title:"Jurassic Park"" query : TermQuery

Lucene's own QueryParser seems to parse "AND (NOT" into the same kind of object tree.

Is this a bug in Lucene or have I misunderstood Lucene's query evaluation? I am happy to give more information if necessary.

Comment by 强者自强, 2024-09-01 18:48:29

They are not semantically the same.

In

+year:1990 +(-title:"Jurassic Park")

you have a subquery that contains only a single NOT clause. What happens is that Lucene evaluates the

-title:"Jurassic Park"

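clause on its own, and it returns 0 documents: a Boolean query that contains only prohibited (MUST_NOT) clauses has no positive clause to select documents from, so there is nothing for the negation to exclude from. You then require that the subquery MUST match, and since it returns zero documents, the query as a whole matches nothing.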
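
To make the difference concrete, here is a small toy model in plain Java. It is not Lucene code and not how Lucene is implemented internally; the class NegationSketch, the Movie type, the helper names and the three sample documents are all made up for illustration. It only mimics the rule described above: MUST clauses select documents, MUST_NOT clauses remove documents from that selection, and a Boolean query without any positive clause selects nothing. The third variant adds a match-all clause to the subquery, which in real Lucene would correspond to a MatchAllDocsQuery, or *:* in the classic query parser syntax.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Toy model of BooleanQuery-style matching. NOT Lucene's actual implementation. */
class NegationSketch {

    /** Hypothetical stand-in for an indexed document. */
    static final class Movie {
        final int id; final int year; final String title;
        Movie(int id, int year, String title) { this.id = id; this.year = year; this.title = title; }
    }

    /** A query maps the document collection to the ids of the matching documents. */
    interface Query { Set<Integer> match(List<Movie> docs); }

    /** Leaf query: matches every document that satisfies the predicate. */
    static Query term(Predicate<Movie> p) {
        return docs -> {
            Set<Integer> hits = new TreeSet<>();
            for (Movie m : docs) if (p.test(m)) hits.add(m.id);
            return hits;
        };
    }

    /** Matches all documents; plays the role of a match-all query such as *:*. */
    static Query matchAll() { return term(m -> true); }

    /** Boolean query with MUST and MUST_NOT clauses only (SHOULD is left out for brevity). */
    static Query bool(List<Query> must, List<Query> mustNot) {
        return docs -> {
            Set<Integer> hits = new TreeSet<>();
            if (must.isEmpty()) return hits; // no positive clause: nothing to select from
            hits.addAll(must.get(0).match(docs));
            for (Query q : must.subList(1, must.size())) hits.retainAll(q.match(docs));
            for (Query q : mustNot) hits.removeAll(q.match(docs));
            return hits;
        };
    }

    // Made-up sample data, chosen only so that the three variants differ.
    static final List<Movie> DOCS = Arrays.asList(
        new Movie(1, 1990, "Jurassic Park"),
        new Movie(2, 1990, "Total Recall"),
        new Movie(3, 1993, "Jurassic Park"));

    static final Query YEAR_1990 = term(m -> m.year == 1990);
    static final Query TITLE_JP = term(m -> m.title.equals("Jurassic Park"));

    /** +year:1990 -title:"Jurassic Park" */
    static Set<Integer> flat() {
        return bool(Arrays.asList(YEAR_1990), Arrays.asList(TITLE_JP)).match(DOCS);
    }

    /** +year:1990 +(-title:"Jurassic Park") */
    static Set<Integer> nested() {
        Query inner = bool(Collections.emptyList(), Arrays.asList(TITLE_JP));
        return bool(Arrays.asList(YEAR_1990, inner), Collections.emptyList()).match(DOCS);
    }

    /** +year:1990 +(*:* -title:"Jurassic Park") */
    static Set<Integer> nestedWithMatchAll() {
        Query inner = bool(Arrays.asList(matchAll()), Arrays.asList(TITLE_JP));
        return bool(Arrays.asList(YEAR_1990, inner), Collections.emptyList()).match(DOCS);
    }

    public static void main(String[] args) {
        System.out.println("flat: " + flat());                      // prints "flat: [2]"
        System.out.println("nested: " + nested());                  // prints "nested: []"
        System.out.println("with *:*: " + nestedWithMatchAll());    // prints "with *:*: [2]"
    }
}
```

In this sketch the purely negative subquery yields an empty set, so intersecting it with the year:1990 matches leaves nothing. Once the subquery has a match-all clause to subtract from, the nested form returns the same document as the flat one.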
