Case-insensitive spellcheck in Solr

Posted 2024-12-01 12:47:41

How can we make the Solr spellchecker ignore case? For the query "Lether", I get the suggestion "leather", which is right. But if the query is "lether", I get a different suggestion, such as "lethel", which is not correct.

I tried the configuration mentioned in this post, but it doesn't seem to work.

I have copied my configuration here for reference:

<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<field name="spelltext" type="text_spell" indexed="true" stored="false" multiValued="true"/>
<field name="title" type="text" indexed="true" stored="true" multiValued="false" omitNorms="true"/>
<copyField source="title" dest="spelltext" />

Is there anything obvious that I am missing?

清旖 2024-12-08 12:47:41

In case you haven't already, you must specify in your spellchecker component a fieldType that uses a lowercase filter, e.g.,

<lst name="spellchecker">
  <str name="name">spell</str>
  <str name="field">text_spell</str>
  <str name="spellcheckIndexDir">spell</str>
  <str name="buildOnOptimize">true</str>
</lst>

Secondly, note the buildOnOptimize setting, which rebuilds your spellcheck index on the optimize command.
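
For reference, here is a rough sketch of how the two points above might fit together in solrconfig.xml for the schema shown in the question. It rests on some assumptions that the thread does not confirm: that the component is registered under the name "spellcheck", that the dictionary should be built from the spelltext field (in the question's schema, text_spell is the field type and spelltext is the field that uses it), and that the query-side lowercasing is meant to come from queryAnalyzerFieldType. Treat it as a starting point rather than the asker's actual configuration.

```xml
<!-- solrconfig.xml: illustrative sketch only, adapted to the question's schema -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- Field type whose query analyzer (with LowerCaseFilterFactory) is applied
       to the incoming spellcheck query -->
  <str name="queryAnalyzerFieldType">text_spell</str>

  <lst name="spellchecker">
    <str name="name">spell</str>
    <!-- Assumption: the indexed field (not the type) that supplies dictionary terms -->
    <str name="field">spelltext</str>
    <str name="spellcheckIndexDir">spell</str>
    <!-- Rebuild the spellcheck index whenever the main index is optimized -->
    <str name="buildOnOptimize">true</str>
  </lst>
</searchComponent>
```

After a change like this, the spellcheck index would still hold the old terms until it is rebuilt, either through an optimize (because of buildOnOptimize) or by sending a request with spellcheck.build=true.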
