Lucene QueryParser 解释“AND OR”;作为命令?
我使用以下代码(准确地说是 PyLucene)调用 Lucene:
analyzer = StandardAnalyzer(Version.LUCENE_30)
queryparser = QueryParser(Version.LUCENE_30, "text", analyzer)
query = queryparser.parse(queryparser.escape(querytext))
但请考虑这是否是 querytext
的内容:
querytext = "THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT"
在这种情况下,“AND OR”会使查询解析器出错,即使我我使用queryparser.escape
。如何避免出现以下错误消息?
Java stacktrace:
org.apache.lucene.queryParser.ParseException: Cannot parse 'THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT': Encountered " <OR> "OR "" at line 1, column 80.
Was expecting one of:
<NOT> ...
"+" ...
"-" ...
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
"[" ...
"{" ...
<NUMBER> ...
<TERM> ...
"*" ...
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:187)
....
at org.apache.lucene.queryParser.QueryParser.generateParseException(QueryParser.java:1759)
at org.apache.lucene.queryParser.QueryParser.jj_consume_token(QueryParser.java:1641)
at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1268)
at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1207)
at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1167)
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:182)
I am calling Lucene using the following code (PyLucene, to be precise):
analyzer = StandardAnalyzer(Version.LUCENE_30)
queryparser = QueryParser(Version.LUCENE_30, "text", analyzer)
query = queryparser.parse(queryparser.escape(querytext))
But consider if this is the content of querytext
:
querytext = "THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT"
In that case, the "AND OR" trips up the queryparser, even though I am use queryparser.escape
. How do I avoid the following error message?
Java stacktrace:
org.apache.lucene.queryParser.ParseException: Cannot parse 'THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT': Encountered " <OR> "OR "" at line 1, column 80.
Was expecting one of:
<NOT> ...
"+" ...
"-" ...
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
"[" ...
"{" ...
<NUMBER> ...
<TERM> ...
"*" ...
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:187)
....
at org.apache.lucene.queryParser.QueryParser.generateParseException(QueryParser.java:1759)
at org.apache.lucene.queryParser.QueryParser.jj_consume_token(QueryParser.java:1641)
at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1268)
at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1207)
at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1167)
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:182)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
这不仅仅是
OR
,而是AND OR
。我使用以下解决方法:
It's not just
OR
, it'sAND OR
.I use the following workaround:
queryparser.parse 仅转义特殊字符(如此页面< /a>) 并保留“AND OR”不变,因此它在您的情况下不起作用。由于您可能还使用 StandardAnalyzer 来分析文本,因此索引中的术语已经是小写的。因此,您可以在将整个查询字符串提供给查询解析器之前将其更改为小写。小写的“and”和“or”不被视为运算符,因此“and or”不会使查询解析器出错。
queryparser.parse only escapes special characters (as shown in this page) and leaves "AND OR" unchanged, so it would not work in your case. Since presumably you also used StandardAnalyzer to analyze your text, the terms in your index are already in lowercase. So you can change the whole query string to lowercase before giving it to the queryparser. Lowercase "and" and "or" are not considered operators, so "and or" would not trip the queryparser.
我意识到我来得太晚了,但在搜索字符串周围加上引号是一个更好的选择:
I realise I'm rather late to the party here, but putting quotes round the search string is a better option: