How do I implement a Solr spellchecker?

Posted on 2024-11-16 01:34:00

I want to implement a spellchecker component in my search application using Solr. What configuration changes does it require?


听风念你 2024-11-23 01:34:00

Add the following section to your solrconfig.xml:

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

      <lst name="spellchecker">
        <!--
          Optional, but required when more than one spellchecker is configured.
          Select a non-default name with spellcheck.dictionary in the request handler.
        -->
        <str name="name">default</str>
        <!-- The classname is optional, defaults to IndexBasedSpellChecker -->
        <str name="classname">solr.IndexBasedSpellChecker</str>
        <!--
          Load tokens from the following field for spell checking;
          the analyzer for the field's type as defined in schema.xml is used
        -->
        <str name="field">spell</str>
        <!-- Optional, by default use in-memory index (RAMDirectory) -->
        <str name="spellcheckIndexDir">./spellchecker</str>
        <!-- Set the accuracy (float) to be used for the suggestions. Default is 0.5 -->
        <str name="accuracy">0.7</str>
        <!-- Require terms to occur in 1/100th of 1% of documents in order to be included in the dictionary -->
        <float name="thresholdTokenFrequency">.0001</float>
      </lst>

      <!-- Example of using a different distance measure -->
      <lst name="spellchecker">
        <str name="name">jarowinkler</str>
        <str name="field">lowerfilt</str>
        <!-- Use a different distance measure -->
        <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
        <!-- Separate directory so the two spellchecker indexes do not overwrite each other -->
        <str name="spellcheckIndexDir">./spellchecker2</str>
      </lst>

      <!-- This field type's analyzer is used by the QueryConverter to tokenize the value of the "q" parameter -->
      <str name="queryAnalyzerFieldType">textSpell</str>
    </searchComponent>

    <!--
      The SpellingQueryConverter converts raw (CommonParams.Q) queries into tokens. It uses a simple regular expression
      to strip off field markup, boosts, ranges, etc., but it is not guaranteed to match an exact parse from the query parser.

      Optional, defaults to solr.SpellingQueryConverter
    -->
    <queryConverter name="queryConverter" class="solr.SpellingQueryConverter"/>

    <!--  Add to a RequestHandler
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    NOTE:  YOU LIKELY DO NOT WANT A SEPARATE REQUEST HANDLER FOR THIS COMPONENT.  THIS IS DONE HERE SOLELY FOR
    THE SIMPLICITY OF THE EXAMPLE.  YOU WILL LIKELY WANT TO BIND THE COMPONENT TO THE /select STANDARD REQUEST HANDLER.
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    -->
    <requestHandler name="/spellCheckCompRH" class="solr.SearchHandler">
      <lst name="defaults">
        <!-- Optional, must match spell checker's name as defined above, defaults to "default" -->
        <str name="spellcheck.dictionary">default</str>
        <!-- omp = Only More Popular -->
        <str name="spellcheck.onlyMorePopular">false</str>
        <!-- exr = Extended Results -->
        <str name="spellcheck.extendedResults">false</str>
        <!-- The number of suggestions to return -->
        <str name="spellcheck.count">1</str>
      </lst>
      <!--
      !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
      REPEAT NOTE:  YOU LIKELY DO NOT WANT A SEPARATE REQUEST HANDLER FOR THIS COMPONENT.  THIS IS DONE HERE SOLELY FOR
      THE SIMPLICITY OF THE EXAMPLE.  YOU WILL LIKELY WANT TO BIND THE COMPONENT TO THE /select STANDARD REQUEST HANDLER.
      !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
      -->
      <arr name="last-components">
        <str>spellcheck</str>
      </arr>
    </requestHandler>

This configuration sample comes from the Solr Wiki.
After adding it, you can ask Solr to build the spellchecker index with a request like this:

http://localhost:8983/solr/spellCheckCompRH?q=some%20query&spellcheck=true&spellcheck.collate=true&spellcheck.build=true

Do not include the last parameter, spellcheck.build=true, in every request, because it rebuilds the spelling index each time it is sent.
After the first request, the URL becomes:

http://localhost:8983/solr/spellCheckCompRH?q=some%20query&spellcheck=true&spellcheck.collate=true

In the XML section above, don't forget to replace the field name spell with the field you want to build your spellchecker from.
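The field spell and the field type textSpell referenced in the configuration also have to exist in schema.xml. A minimal sketch of what that could look like, assuming a hypothetical source field named title that feeds the spellcheck field through copyField; the analyzer chain shown is one reasonable choice, not the only one:

```xml
<!-- schema.xml (sketch): "title" is a hypothetical source field, replace it with your own -->

<!-- Light analysis only: no stemming, so suggestions are real words from the index -->
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<!-- The field named in <str name="field">spell</str> of the spellchecker config -->
<field name="spell" type="textSpell" indexed="true" stored="false" multiValued="true"/>

<!-- Copy the text you want suggestions for into the spellcheck field -->
<copyField source="title" dest="spell"/>
```

After changing schema.xml, the documents need to be reindexed before the spellchecker index is built, since the dictionary is drawn from the indexed terms of this field.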

Now you can feel the power of spellchecking.
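As the comments in the sample point out, the component is usually attached to the main search handler rather than to a separate one. A rough sketch, assuming your solrconfig.xml already defines a handler named /select (merge these entries into the existing definition instead of adding a second handler with the same name):

```xml
<!-- solrconfig.xml (sketch): merge into your existing /select handler definition -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Must match the name of a spellchecker defined in the spellcheck component -->
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <!-- Run the spellcheck component after the standard search components -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

With this in place, adding spellcheck=true to a normal /select query should return suggestions alongside the regular results.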
