Proximity search with phrases in Solr
I use Solr's proximity search quite often to find words within a specified distance of each other, like so:
"Government Spending"~2
Is there a way to perform a proximity search using a phrase and a word, or two phrases? If so, what is the syntax?
This appears to be "somewhat" doable. Consider this text:
"more traffic between solr"~2
"more about between solr"~2
Even if you change the order it works:
"more about solr between"~2
But too far apart and it stops working:
"more about servers themselves"~2
If that doesn't work, it would probably not be too hard to write a custom request handler that does this. You might need to define a new syntax, perhaps something like
("phrase one" "phrase two")~2
I would guess that if you are shingling, and you build a Lucene query in which "phrase one" is a single token and "phrase two" is another, with a given proximity between them, it will work. (Of course, you will need to actually make the Lucene Java calls; you can't just hand the query over. See http://lucene.apache.org/java/2_2_0/api/index.html.)
I have found a way to perform a Solr proximity search with more than one word or phrase out of the box, see below.
E.g. with 3 words:
"(word1) (word2) (word3)"~10
E.g. with 2 phrases (note that the inner double quotes need to be escaped):
"(\"phrase1\") (\"phrase2\")"~10
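If you build these queries in code, the escaping is easy to get wrong, so here is a minimal Python sketch of the pattern above. The helper name `proximity_query` is made up for illustration, and it assumes the words and phrases contain no other characters that are special to the query parser (it does not escape them).

```python
from urllib.parse import urlencode


def proximity_query(parts, slop):
    """Build a query of the form "(word1) (\"some phrase\")"~slop.

    Hypothetical helper: single words are wrapped as (word),
    multi-word phrases as (\"a b\") with escaped inner quotes.
    """
    groups = []
    for part in parts:
        part = part.strip()
        if " " in part:
            # Multi-word phrase: escaped double quotes inside the parentheses.
            groups.append('(\\"' + part + '\\")')
        else:
            groups.append("(" + part + ")")
    return '"' + " ".join(groups) + '"~' + str(slop)


words = proximity_query(["word1", "word2", "word3"], 10)
phrases = proximity_query(["phrase one", "phrase two"], 10)

print(words)
print(phrases)
# URL-encoded form, ready to append to a select URL of your own.
print(urlencode({"q": phrases}))
```

The first two printed lines are the raw query strings; the last one is the same phrase query encoded as a `q` parameter.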
Since Solr 4 this is possible with the SurroundQueryParser.
E.g. to query where "phrase two" follows "phrase one" by no more than 3 words:
To query "phrase two" within 5 words of "phrase one":
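The example queries themselves are not shown above, so what follows is only a sketch of what they would presumably look like, based on the Surround parser's documented operators: `w` for ordered and `n` for unordered proximity, with an optional distance prefix such as `3w` or `5n`. The helper names are made up for illustration. Lowercasing the terms is an assumption on my part: as far as I recall the Surround parser does not run them through the field's analyzer, so they need to match what is in the index.

```python
def surround_phrase(phrase):
    """Turn 'phrase one' into 'phrase w one'.

    A bare 'w' (no distance prefix) means the terms are adjacent and in order.
    Lowercasing is an assumption (see the note above the code).
    """
    return " w ".join(phrase.lower().split())


def surround_near(first, second, distance, ordered=True):
    """Hypothetical helper: build a {!surround} query in prefix notation."""
    op = f"{distance}{'w' if ordered else 'n'}"
    return (
        f"{{!surround}}{op}"
        f"({surround_phrase(first)}, {surround_phrase(second)})"
    )


# "phrase two" follows "phrase one" within 3 positions (ordered).
ordered_q = surround_near("phrase one", "phrase two", 3)
# "phrase two" within 5 positions of "phrase one", in either order.
unordered_q = surround_near("phrase one", "phrase two", 5, ordered=False)

print(ordered_q)
print(unordered_q)
```

The resulting strings would go in the `q` parameter; `{!surround}` is the local-params prefix that selects the parser.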