Solr edismax wildcard search does not find the original string
I have the following content in my Solr index: west indian cherry, in a field of type text_en (see below for the field definition).
When I search with cherr*, a match is found. A search for cherri* also matches the word in the document. But a search for cherry* does not match.
I suspect PorterStemFilterFactory is responsible, but I don't understand why (the query analyzer is the same as the index analyzer).
Sample query:
/solr/select?defType=edismax&q=cherry*
solrconfig.xml
...
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
...
Field analysis
Index:
org.apache.solr.analysis.StandardTokenizerFactory: cherry
org.apache.solr.analysis.LowerCaseFilterFactory: cherry
org.apache.solr.analysis.EnglishPossessiveFilterFactory: cherry
org.apache.solr.analysis.PorterStemFilterFactory: cherri <-- note the change from cherry to cherri
Query:
org.apache.solr.analysis.StandardTokenizerFactory: cherry
org.apache.solr.analysis.LowerCaseFilterFactory: cherry
org.apache.solr.analysis.EnglishPossessiveFilterFactory: cherry
org.apache.solr.analysis.PorterStemFilterFactory: cherri
Comments (1)
The Solr wiki (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Analyzers) notes that no text analysis is performed on the search term in wildcard and fuzzy searches.
So the wildcard query does not undergo any analysis at query time, and the indexed terms differ from the term being searched for.
Because the indexed term is cherri, a search for cherry* does not match any documents, whereas cherr* and cherri* are both prefixes of cherri and do match.
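The mismatch can be reproduced outside Solr with a small simulation. This is only a sketch: toy_stem is a hypothetical stand-in that implements just the Porter "y becomes i" rule, not the full stemmer, and wildcard_matches is an assumed simplification of a prefix query that compares the prefix as typed, without stemming.

```python
def toy_stem(token):
    """Toy stand-in for the Porter stemmer (assumption: only the
    'trailing y -> i when the stem contains a vowel' rule is modelled)."""
    stem = token[:-1]
    if token.endswith("y") and any(ch in "aeiou" for ch in stem):
        return stem + "i"
    return token


def index_terms(text):
    # Index-time analysis: split on whitespace, lowercase, then stem.
    return {toy_stem(word.lower()) for word in text.split()}


def wildcard_matches(query, terms):
    # The wildcard prefix is used as typed; no stemming is applied to it.
    prefix = query.rstrip("*")
    return any(term.startswith(prefix) for term in terms)


terms = index_terms("west indian cherry")
results = {q: wildcard_matches(q, terms) for q in ("cherr*", "cherri*", "cherry*")}
print(sorted(terms))
print(results)
```

With the indexed term stored as cherri, the prefixes cherr and cherri both match it, while the unstemmed prefix cherry does not.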