Solr - highlight query phrase
Is it possible to highlight a whole query phrase? For example, when I search for "United States", I want to get:
<em>United States</em>
and not:
<em>United</em> <em>States</em>
I've searched all over the Internet for an answer and tried every combination of the hl.mergeContiguous, hl.usePhrasesHighlighter and hl.highlightMultiTerm parameters, but I still cannot make it work.
My query is:
http://localhost:8983/solandra/idxPosts.proj350_139/select?q=post_text:"Janusz Palikot"&hl=true&hl.fl=post_text&hl.mergeContiguous=true&hl.usePhrasesHighlighter=true&hl.highlightMultiTerm=true
The response is:
...
<arr name="post_text"><str>Tag: <em>janusz</em> <em>palikot</em> - Sowiniec: "Sowiniec"</str></arr>
...
My "post_text" field is:
<field name="post_text" type="text" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" required="true" />
My "text" type is:
<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_pl.txt" />
    <filter class="solr.ReversedWildcardFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_pl.txt" />
  </analyzer>
</fieldType>
I also tried the FastVectorHighlighter with hl.useFastVectorHighlighter=true, but ran into an error:
Problem accessing /solandra/idxPosts.proj350_139/select. Reason:
-6
java.lang.ArrayIndexOutOfBoundsException: -6
at lucandra.TermFreqVector.getOffsets(TermFreqVector.java:224)
at org.apache.lucene.search.vectorhighlight.FieldTermStack.<init>(FieldTermStack.java:100)
at org.apache.lucene.search.vectorhighlight.FastVectorHighlighter.getFieldFragList(FastVectorHighlighter.java:175)
at org.apache.lucene.search.vectorhighlight.FastVectorHighlighter.getBestFragments(FastVectorHighlighter.java:166)
at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByFastVectorHighlighter(DefaultSolrHighlighter.java:509)
at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:376)
at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:116)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
...
Can you help me, please?
Comments (2)
For phrase highlighting, there is a Jira issue still waiting to make it into the Solr code.
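Until something like that is available, one possible workaround is to merge the adjacent tags on the client side after Solr returns the snippet. The following is only a minimal sketch of that idea, not something Solr does for you: the function name merge_adjacent_em is made up for illustration, and it assumes the default <em> markers with nothing but whitespace between the highlighted terms.

```python
import re

# A closing </em>, then whitespace only, then an opening <em>.
_ADJACENT_EM = re.compile(r"</em>(\s+)<em>")


def merge_adjacent_em(snippet: str) -> str:
    """Join highlighted terms separated only by whitespace into one <em> span.

    Hypothetical helper: it keeps the whitespace and drops the tag pair
    that sits between two neighbouring highlighted terms.
    """
    return _ADJACENT_EM.sub(r"\1", snippet)


snippet = 'Tag: <em>janusz</em> <em>palikot</em> - Sowiniec: "Sowiniec"'
print(merge_adjacent_em(snippet))
# prints: Tag: <em>janusz palikot</em> - Sowiniec: "Sowiniec"
```

Note that this merges any two highlighted terms that happen to be neighbours, whether or not they came from the same phrase in the query, so it fits best when the query is a single phrase like the one in the question.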
Check the Solr documentation for this: there is a parameter hl; set it to true.
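For reference, here is a rough sketch of building such a request with the highlighting parameters URL-encoded. One detail worth checking: stock Solr documents the parameter as hl.usePhraseHighlighter, while the query in the question spells it hl.usePhrasesHighlighter. Whether the corrected name changes the result on Solandra is untested here. The helper name build_highlight_url is hypothetical; the base URL, field and phrase are taken from the question.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_highlight_url(base_url: str, field: str, phrase: str) -> str:
    """Build a select URL for a phrase query with highlighting switched on.

    Hypothetical helper for illustration; the parameter names are the
    ones documented for stock Solr highlighting.
    """
    params = {
        "q": f'{field}:"{phrase}"',         # phrase query on one field
        "hl": "true",                       # enable highlighting
        "hl.fl": field,                     # field to highlight
        "hl.usePhraseHighlighter": "true",  # documented spelling
        "hl.mergeContiguous": "true",
    }
    # urlencode escapes the quotes, colon and space in the q parameter.
    return f"{base_url}?{urlencode(params)}"


url = build_highlight_url(
    "http://localhost:8983/solandra/idxPosts.proj350_139/select",
    "post_text",
    "Janusz Palikot",
)
print(url)

# Round-trip check: the decoded query string carries the intended values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"])
```

Encoding the parameters this way avoids sending raw quotes and spaces in the URL, which some clients and servers handle inconsistently.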