Solr highlighting uses an unexpected prefix and suffix
I need to customize Solr highlighting prefix and suffix like this:
<span class="highlight">text</span>
instead of the default
<em>text</em>
That's why I'm using this configuration for the HighlightComponent in solrconfig.xml:
<searchComponent class="solr.HighlightComponent" name="highlight">
<highlighting>
<fragmentsBuilder name="simple" default="true" class="solr.highlight.SimpleFragmentsBuilder">
<lst name="defaults">
<str name="hl.tag.pre"><![CDATA[<span class="highlight">]]></str>
<str name="hl.tag.post"><![CDATA[</span>]]></str>
</lst>
</fragmentsBuilder>
</highlighting>
</searchComponent>
The following are the default parameters for my standard request handler:
<requestHandler name="standard" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="hl">true</str>
<str name="hl.fl">body,title</str>
<str name="hl.useFastVectorHighlighter">true</str>
</lst>
</requestHandler>
When I search for the word text, it does get highlighted, but not always with the prefix and suffix I configured:
<lst name="highlighting">
<lst name="document_1">
<arr name="body">
<str>my <em>text</em> highlighted</str>
</arr>
<arr name="title">
<str>my <span class="highlight">text</span> highlighted</str>
</arr>
</lst>
</lst>
Does anybody know why?
Comments (2)
I'm guessing you see this behavior because the prefix and suffix are defined only for the SimpleFragmentsBuilder, and the other highlights come from another fragment builder.
I use a custom prefix and suffix for my highlighting. I set them in the formatter section of the highlighting section of solrconfig.xml and have not had any issues, as it will apply to all fragment builders. So maybe try setting the prefix and suffix there.
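A minimal sketch of what such a formatter section might look like, assuming the same span markup as in the question. The element goes inside the highlighting element of the HighlightComponent; hl.simple.pre and hl.simple.post are the parameter names read by the standard (non-FastVector) highlighter, and name="html" is only an illustrative label:

```xml
<!-- Sketch only: formatter for the standard highlighter.
     Place inside <highlighting> of the HighlightComponent. -->
<formatter name="html" default="true" class="solr.highlight.HtmlFormatter">
  <lst name="defaults">
    <!-- Opening and closing markup wrapped around each highlighted term -->
    <str name="hl.simple.pre"><![CDATA[<span class="highlight">]]></str>
    <str name="hl.simple.post"><![CDATA[</span>]]></str>
  </lst>
</formatter>
```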
I finally found out why! I'm using fastVectorHighlighter to make highlighting faster.
At the beginning I was highlighting only the title field and everything worked fine. When I added the body field to highlighting, I forgot to enable termVectors=true on it. Now that the body field has termVectors enabled, and after a full reindex, highlighting works perfectly.
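For illustration, a schema.xml field definition with term vectors enabled might look like the sketch below. The field type name text is an assumption, not taken from the original post. Note that the FastVectorHighlighter needs termPositions and termOffsets in addition to termVectors, and that changing these attributes requires a reindex:

```xml
<!-- Sketch only: type="text" is a placeholder for whatever field type the schema defines. -->
<field name="body" type="text" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```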
Previously, highlighting of the body field did work, but without fastVectorHighlighter, since the field didn't have the termVectors=true parameter. That's why body was highlighted with the default prefix and suffix. Since fastVectorHighlighter is a completely different highlighting method, its configuration is different as well.
To avoid this kind of mistake, as long as users can choose which fields to highlight with the hl.fl parameter, I'd recommend also including the configuration for the standard highlighter (the formatter element, class solr.highlight.HtmlFormatter). That way, highlighting uses the same prefix and suffix even for fields with termVectors disabled.
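Put together, a combined configuration could be sketched as follows. It reuses the fragmentsBuilder from the question and adds a formatter for the standard highlighter; the formatter name html is illustrative, and the exact layout should be checked against the Solr version in use:

```xml
<searchComponent class="solr.HighlightComponent" name="highlight">
  <highlighting>
    <!-- FastVectorHighlighter: used for fields indexed with term vectors -->
    <fragmentsBuilder name="simple" default="true" class="solr.highlight.SimpleFragmentsBuilder">
      <lst name="defaults">
        <str name="hl.tag.pre"><![CDATA[<span class="highlight">]]></str>
        <str name="hl.tag.post"><![CDATA[</span>]]></str>
      </lst>
    </fragmentsBuilder>
    <!-- Standard highlighter: used as the fallback for fields without term vectors -->
    <formatter name="html" default="true" class="solr.highlight.HtmlFormatter">
      <lst name="defaults">
        <str name="hl.simple.pre"><![CDATA[<span class="highlight">]]></str>
        <str name="hl.simple.post"><![CDATA[</span>]]></str>
      </lst>
    </formatter>
  </highlighting>
</searchComponent>
```

With both sections present, a field requested through hl.fl should get the same span markup whichever highlighter ends up handling it.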