Solr: full-sentence spellcheck

Posted on 2024-11-30 12:41:30

I'm trying to configure a spellchecker to autocomplete full sentences from my query.

I've already been able to get these results:

"american israel" :
-> "american something"
-> "israel something"

But I want:

"american israel" :
-> "american israel something"

This is my solrconfig.xml:

<searchComponent name="suggest_full" class="solr.SpellCheckComponent">
 <str name="queryAnalyzerFieldType">suggestTextFull</str>
 <lst name="spellchecker">
  <str name="name">suggest_full</str>
  <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
  <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
  <str name="field">text_suggest_full</str>
  <str name="fieldType">suggestTextFull</str>
 </lst>
</searchComponent>

<requestHandler name="/suggest_full" class="org.apache.solr.handler.component.SearchHandler">
 <lst name="defaults">
  <str name="echoParams">explicit</str>
  <str name="spellcheck">true</str>
  <str name="spellcheck.dictionary">suggest_full</str>
  <str name="spellcheck.count">10</str>
  <str name="spellcheck.onlyMorePopular">true</str>
 </lst>
 <arr name="last-components">
  <str>suggest_full</str>
 </arr>
</requestHandler>

And this is my schema.xml:

<fieldType name="suggestTextFull" class="solr.TextField">
  <analyzer type="index">  
    <tokenizer class="solr.KeywordTokenizerFactory"/>  
    <filter class="solr.LowerCaseFilterFactory"/>  
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">  
    <tokenizer class="solr.KeywordTokenizerFactory"/>  
    <filter class="solr.LowerCaseFilterFactory"/>  
  </analyzer>
</fieldType>

...

<field name="text_suggest_full" type="suggestTextFull" indexed="true" stored="false" multiValued="true"/>

I've read somewhere that I have to use spellcheck.q because q is analyzed with the WhitespaceAnalyzer, but when I use spellcheck.q I get a java.lang.NullPointerException.

Any ideas?
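
For reference, spellcheck.q is an ordinary request parameter, so the whole phrase can be sent through it as a single value. The sketch below only builds such a request URL; it does not address the NullPointerException. The host and port are borrowed from the URL in one of the answers, the /solr/suggest_full path is an assumption based on the handler name above, and suggest_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed base URL: localhost:8983 plus the "/suggest_full" handler name
# from the solrconfig.xml above. Adjust to the actual deployment.
BASE_URL = "http://localhost:8983/solr/suggest_full"


def suggest_url(phrase, base=BASE_URL, count=10):
    """Build a request URL that passes the whole phrase via spellcheck.q."""
    params = {
        "spellcheck": "true",
        "spellcheck.q": phrase,      # the full phrase as a single value
        "spellcheck.count": count,
    }
    # urlencode escapes the space in the phrase, so it stays one parameter.
    return base + "?" + urlencode(params)


print(suggest_url("american israel"))
```

The point is only that the phrase travels as a single parameter value; what Solr does with it still depends on the query analyzer configured for the component.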

Comments (4)

初相遇 2024-12-07 12:41:30

If your spellcheck field (text_suggest_full) contains "american something" and "israel something", make sure there is also a document/entry with the value "american israel something".

Solr will not merge "american something" and "israel something" into one term, so neither entry can be returned as a suggestion for "american israel".
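
A toy model of this, in plain Python rather than Solr's actual lookup code: with a keyword-style tokenizer each field value is stored as one term, and a prefix lookup can only return terms that exist as a whole. The function names below are hypothetical, and the sorted list is a stand-in for the TSTLookup structure.

```python
from bisect import bisect_left


def build_dictionary(field_values):
    """Store each lowercased field value as one term, sorted and de-duplicated.

    This mimics the effect of KeywordTokenizer + LowerCaseFilter in the
    question's field type; it is not Solr code.
    """
    return sorted({value.lower() for value in field_values})


def suggest(dictionary, prefix, count=10):
    """Return up to `count` terms that start with `prefix`."""
    prefix = prefix.lower()
    matches = []
    # All terms sharing the prefix are contiguous in the sorted list.
    for term in dictionary[bisect_left(dictionary, prefix):]:
        if not term.startswith(prefix) or len(matches) == count:
            break
        matches.append(term)
    return matches


values = ["american something", "israel something"]
print(suggest(build_dictionary(values), "american israel"))  # []

# Only once the full phrase is indexed as its own entry can it be suggested.
values.append("american israel something")
print(suggest(build_dictionary(values), "american israel"))
# ['american israel something']
```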

作妖 2024-12-07 12:41:30

Wouldn't an autocomplete approach be more suitable? See this article, for example.

冷清清 2024-12-07 12:41:30

You can use the Suggester, a flexible "autocomplete" component.
It requires Solr 3.x.

solrconfig.xml:

<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">name_autocomplete</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

schema.xml:

<field name="name_autocomplete" type="text" indexed="true" stored="true" multiValued="false" />

Add a copyField:

<copyField source="name" dest="name_autocomplete" />

Reload Solr, reindex everything, and test:
http://localhost:8983/solr/suggest?q=amer&spellcheck=true&spellcheck.collate=true&spellcheck.build=true

You should get something like:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="ameri">
        <int name="numFound">2</int>
        <int name="startOffset">0</int>
        <int name="endOffset">2</int>
        <arr name="suggestion">
          <str>american morocco</str>
          <str>american morocco something</str>
        </arr>
      </lst>
      <str name="collation">american morocco something</str>
    </lst>
  </lst>
</response>

Hope that helps.

Cheers

温柔少女心 2024-12-07 12:41:30

IMHO, a problem with the spellcheck component is that each word is spell-checked against the full index.
The "collation" of the spell-checked words does not necessarily match a single document in the index; its words might come from separate indexed documents.
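
A small illustration of that effect, as a toy model and not Solr's collation logic: each query word is resolved on its own against the words of all documents, and the results are then joined. Prefix completion stands in here for the per-word spelling lookup, and all function names are hypothetical.

```python
def word_dictionary(documents):
    """All distinct lowercase words across every document, sorted."""
    return sorted({word for doc in documents for word in doc.lower().split()})


def complete_word(dictionary, prefix):
    """First dictionary word starting with `prefix`, or the prefix unchanged."""
    for word in dictionary:
        if word.startswith(prefix):
            return word
    return prefix


def collate(dictionary, query):
    """Resolve each query word independently, then join the results."""
    return " ".join(complete_word(dictionary, w) for w in query.lower().split())


documents = ["american something", "israel something"]
collation = collate(word_dictionary(documents), "amer isr")

print(collation)                                   # american israel
print(any(collation in doc for doc in documents))  # False
```

Each word of the collation exists in the index, yet no single document contains the phrase, which is why per-word checking cannot deliver full-sentence suggestions by itself.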
