Lucene QueryParser interprets "AND OR" as a command?

Posted 2024-09-13 13:19:47

I am calling Lucene using the following code (PyLucene, to be precise):

analyzer = StandardAnalyzer(Version.LUCENE_30)
queryparser = QueryParser(Version.LUCENE_30, "text", analyzer)
query = queryparser.parse(queryparser.escape(querytext))

But consider if this is the content of querytext:

querytext = "THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT"

In that case, the "AND OR" trips up the queryparser, even though I am using queryparser.escape. How do I avoid the following error message?

    Java stacktrace:
org.apache.lucene.queryParser.ParseException: Cannot parse 'THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT': Encountered " <OR> "OR "" at line 1, column 80.
Was expecting one of:
    <NOT> ...
    "+" ...
    "-" ...
    "(" ...
    "*" ...
    <QUOTED> ...
    <TERM> ...
    <PREFIXTERM> ...
    <WILDTERM> ...
    "[" ...
    "{" ...
    <NUMBER> ...
    <TERM> ...
    "*" ...

 at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:187)
     ....
 at org.apache.lucene.queryParser.QueryParser.generateParseException(QueryParser.java:1759)
 at org.apache.lucene.queryParser.QueryParser.jj_consume_token(QueryParser.java:1641)
 at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1268)
 at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1207)
 at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1167)
 at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:182)


Comments (3)

一片旧的回忆 2024-09-20 13:19:47

It's not just OR, it's AND OR.

I use the following workaround:

query = queryparser.parse(queryparser.escape(querytext.replace("AND OR", "AND or")))

各空 2024-09-20 13:19:47

queryparser.escape only escapes special characters (as shown on this page) and leaves "AND OR" unchanged, so it does not help in your case. Since you presumably also used StandardAnalyzer to analyze your text, the terms in your index are already lowercase. So you can convert the whole query string to lowercase before giving it to the queryparser. Lowercase "and" and "or" are not treated as operators, so "and or" would not trip up the queryparser.
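
If you would rather not lowercase the entire string, a narrower variant is to lowercase only the standalone operator words. The sketch below is plain Python and does not need PyLucene to run; the helper name neutralize_operators is made up for illustration, and it assumes AND, OR and NOT are the only keyword operators you need to defuse (the symbol operators should already be handled by queryparser.escape).

```python
import re

# Uppercase words that the classic Lucene query syntax treats as boolean operators.
# Assumption: these three keywords are the only ones that need defusing here.
_OPERATOR_WORDS = re.compile(r"\b(AND|OR|NOT)\b")


def neutralize_operators(text):
    """Lowercase standalone AND/OR/NOT so they are parsed as ordinary terms."""
    return _OPERATOR_WORDS.sub(lambda match: match.group(0).lower(), text)


querytext = "WOULD NOT GO BACK AND OR RECOMMEND IT"
safe_text = neutralize_operators(querytext)
print(safe_text)

# With PyLucene available, the parse call from the question would then become:
# query = queryparser.parse(queryparser.escape(safe_text))
```

Words that merely contain an operator, such as WORTH, are left alone because the pattern only matches whole words.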

纵性 2024-09-20 13:19:47

I realise I'm rather late to the party here, but putting quotes round the search string is a better option:

querytext = "\"THE FOOD WAS ... \""
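
One thing to watch with the quoting approach: as far as I know, queryparser.escape also escapes double quotes, so the quotes have to be added without passing the result through escape afterwards. A rough sketch of that in plain Python follows; to_phrase_query is a hypothetical helper name, and it assumes backslash and double quote are the only characters that need escaping inside a quoted phrase. Note also that a quoted string is parsed as a phrase query, so it matches differently from a bag of individual terms.

```python
def to_phrase_query(text):
    """Wrap text in double quotes, escaping backslashes and embedded quotes.

    Assumption: inside a quoted phrase only the backslash and the double
    quote need escaping.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


querytext = "WOULD NOT GO BACK AND OR RECOMMEND IT"
phrase = to_phrase_query(querytext)
print(phrase)

# With PyLucene available, parse the phrase directly (no queryparser.escape here,
# since that would escape the surrounding quotes as well):
# query = queryparser.parse(phrase)
```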