Solr - Example spellchecker not working

Posted on 2025-01-03 07:12:28

I have set up the spellchecker for the example configuration that ships with Solr, following the instructions on the wiki: http://wiki.apache.org/solr/SpellCheckComponent

The problem is that, even after following those instructions exactly, I still cannot get it to work.

When I build the spellcheck index with http://localhost:8983/solr/spell?q=*:*&spellcheck.build=true&spellcheck.q=delll%20ultrashar&spellcheck=true the response looks as follows:

<response>
    <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">14</int>
    </lst>
    <str name="command">build</str>
    <result name="response" numFound="17" start="0">
    ...
    </result>
    <lst name="spellcheck">
        <lst name="suggestions"/>
    </lst>
</response>

When I then query with http://localhost:8983/solr/spell?q=*:*&spellcheck.q=delll+ultrashar&spellcheck=true&spellcheck.extendedResults=true I get the following response:

<response>
    <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">1</int>
    </lst>
    <result name="response" numFound="17" start="0">
    ...
    </result>
    <lst name="spellcheck">
        <lst name="suggestions">
            <bool name="correctlySpelled">false</bool>
        </lst>
    </lst>
</response>

What gives? Am I missing something in my schema.xml?

The schema.xml is here: http://www.developermill.com/schema.xml

The solrconfig.xml is here: http://www.developermill.com/solrconfig.xml

The only change to the example files was the addition of the following in the solrconfig.xml:

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <lst name="spellchecker">
    <!--
        Optional, it is required when more than one spellchecker is configured.
        Select non-default name with spellcheck.dictionary in request handler.
    -->
    <str name="name">default</str>
    <!-- The classname is optional, defaults to IndexBasedSpellChecker -->
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <!--
        Load tokens from the following field for spell checking,
        analyzer for the field's type as defined in schema.xml are used
    -->
    <str name="field">spell</str>
    <!-- Optional, by default use in-memory index (RAMDirectory) -->
    <str name="spellcheckIndexDir">./spellchecker</str>
    <!-- Set the accuracy (float) to be used for the suggestions. Default is 0.5 -->
    <str name="accuracy">0.7</str>
    <!-- Require terms to occur in 1/100th of 1% of documents in order to be included in the dictionary -->
    <float name="thresholdTokenFrequency">.0001</float>
  </lst>
  <!-- Example of using different distance measure -->
  <lst name="spellchecker">
    <str name="name">jarowinkler</str>
    <str name="field">lowerfilt</str>
    <!-- Use a different Distance Measure -->
    <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
    <str name="spellcheckIndexDir">./spellchecker</str>

  </lst>

  <!-- This field type's analyzer is used by the QueryConverter to tokenize the value for "q" parameter -->
  <str name="queryAnalyzerFieldType">textSpell</str>
</searchComponent>
<!--
    The SpellingQueryConverter to convert raw (CommonParams.Q) queries into tokens.  Uses a simple regular expression
    to strip off field markup, boosts, ranges, etc. but it is not guaranteed to match an exact parse from the query parser.

Optional, defaults to solr.SpellingQueryConverter
-->
<queryConverter name="queryConverter" class="solr.SpellingQueryConverter"/>

<!--  Add to a RequestHandler
     !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
     NOTE:  YOU LIKELY DO NOT WANT A SEPARATE REQUEST HANDLER FOR THIS COMPONENT.  THIS IS DONE HERE SOLELY FOR
     THE SIMPLICITY OF THE EXAMPLE.  YOU WILL LIKELY WANT TO BIND THE COMPONENT TO THE /select STANDARD REQUEST HANDLER.
     !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-->
<requestHandler name="/spellCheckCompRH" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Optional, must match spell checker's name as defined above, defaults to "default" -->
    <str name="spellcheck.dictionary">default</str>
    <!-- omp = Only More Popular -->
    <str name="spellcheck.onlyMorePopular">false</str>
    <!-- exr = Extended Results -->
    <str name="spellcheck.extendedResults">false</str>
    <!--  The number of suggestions to return -->
    <str name="spellcheck.count">1</str>
  </lst>
  <!--  Add to a RequestHandler
       !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
       REPEAT NOTE:  YOU LIKELY DO NOT WANT A SEPARATE REQUEST HANDLER FOR THIS COMPONENT.  THIS IS DONE HERE SOLELY FOR
       THE SIMPLICITY OF THE EXAMPLE.  YOU WILL LIKELY WANT TO BIND THE COMPONENT TO THE /select STANDARD REQUEST HANDLER.
       !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

寂寞笑我太脆弱 2025-01-10 07:12:28

The textSpell field type definition is in the wrong place. The following fragment should sit inside the types element of schema.xml:

<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StandardFilterFactory"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true"  expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StandardFilterFactory"/>
    </analyzer>
</fieldType>
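
To make the placement concrete, here is a rough sketch of how the pieces might sit in schema.xml. The schema name and version, the attributes on the spell field, and the name source field in the copyField are assumptions for illustration, not details taken from the posted schema. The analyzer chains are elided and should be the ones from the fragment above.

```xml
<!-- Sketch only: schema name/version and the copyField source are assumed. -->
<schema name="example" version="1.2">
  <types>
    <!-- ... the field types that already ship with the example ... -->

    <!-- textSpell must be declared here, as a child of <types> -->
    <fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
      <!-- index and query analyzers exactly as in the fragment above -->
    </fieldType>
  </types>

  <fields>
    <!-- ... existing fields ... -->

    <!-- The "spell" field named in the spellchecker config; attributes are assumed -->
    <field name="spell" type="textSpell" indexed="true" stored="false" multiValued="true"/>
  </fields>

  <!-- Hypothetical: feed an existing field into the spell field -->
  <copyField source="name" dest="spell"/>
</schema>
```

After moving the field type, the index would need to be rebuilt (and the spellcheck dictionary built again with spellcheck.build=true) before suggestions can be expected.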

Once that is fixed, I would expect everything to work. I would also suggest cleaning up your example a little, since it currently contains just about every option you can configure. Keep only what you really need.
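
As one possible reading of "keep only what you really need", a trimmed-down component might look like the sketch below. It reuses only names that appear in the question's config (default, spell, textSpell, ./spellchecker) and drops the optional settings, so the classname and accuracy fall back to the defaults stated in the original comments (IndexBasedSpellChecker and 0.5). Whether those defaults suit your data is an assumption you would need to check.

```xml
<!-- Minimal sketch: a single index-based spellchecker, all optional settings omitted -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- Field type whose analyzer tokenizes the spellcheck query -->
  <str name="queryAnalyzerFieldType">textSpell</str>

  <lst name="spellchecker">
    <str name="name">default</str>
    <!-- Source field for the dictionary terms -->
    <str name="field">spell</str>
    <!-- On-disk location of the spellcheck index; omit to keep it in memory -->
    <str name="spellcheckIndexDir">./spellchecker</str>
  </lst>
</searchComponent>
```

With only one spellchecker configured, the second jarowinkler entry and its shared index directory are no longer a concern.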
