
Solr does not return only the exact element

Posted on 2025-01-30 11:42:44

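I am using Solr 7.7.3. I have an element with the label "alpha-ravi". When I search in Solr for label:"alpha", it returns the element with the label "alpha-ravi". According to the Solr documentation, it should not return this element. Can anyone explain this behavior?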

吻泪 2025-02-06 11:42:44

If you want exact matches only (i.e., return documents labeled "alpha-ravi" only when the user searches for exactly "alpha-ravi"), I would suggest the keyword tokenizer (solr.KeywordTokenizerFactory). This tokenizer treats the entire value "alpha-ravi" as a single token, so a search for just "alpha" or just "ravi" will not return it as a partial match.

For example, add something like the following to your schema.xml (adjust the filter chain to your needs):

<fieldType name="single_token_string" class="solr.TextField" sortMissingLast="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Then use this fieldType for your field in the same schema.xml, so that the field is analyzed with the keyword tokenizer chain we just defined:

<field name="myField" type="single_token_string" indexed="true" stored="true" />

Solr's default text field types (such as text_general) use the StandardTokenizer, which splits "alpha-ravi" at the hyphen into the tokens "alpha" and "ravi". That is why a search for "alpha" matches the document.
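To make the difference concrete, here is a minimal Python sketch. It does not call Solr; the two functions are rough stand-ins (my assumption, not Solr code) for a StandardTokenizer-style chain and for the keyword tokenizer chain shown above, and `matches` assumes a simple "any query token equals an indexed token" rule.

```python
import re


def standard_like_tokens(text):
    """Rough stand-in for StandardTokenizer + LowerCaseFilter.

    Assumption: splitting on non-alphanumeric characters is close enough
    for a value like "alpha-ravi"; the real tokenizer follows Unicode rules.
    """
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def keyword_tokens(text):
    """Rough stand-in for KeywordTokenizer + TrimFilter + LowerCaseFilter."""
    return [text.strip().lower()]


def matches(query, indexed, analyze):
    """True when any analyzed query token equals an analyzed indexed token."""
    indexed_tokens = analyze(indexed)
    return any(tok in indexed_tokens for tok in analyze(query))


print(standard_like_tokens("alpha-ravi"))  # ['alpha', 'ravi']
print(keyword_tokens("alpha-ravi"))        # ['alpha-ravi']
print(matches("alpha", "alpha-ravi", standard_like_tokens))  # True
print(matches("alpha", "alpha-ravi", keyword_tokens))        # False
```

With the keyword-style chain, only a query for the full value "alpha-ravi" produces a matching token.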

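As an alternative, you could run a phrase query by quoting the value, possibly something like http://localhost:8983/solr/...fq=label:"alpha-ravi". Note that a quoted phrase is still analyzed by the field's query analyzer; on a StandardTokenizer field it only requires the resulting tokens to appear next to each other and in order, so the keyword-tokenized field is the stricter option.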
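When sending such a filter query over HTTP, the quotes and colon need to be URL-encoded. A small sketch using only the Python standard library is shown below; the `q=*:*` main query is an assumption added for illustration, and the host and core path are left out because the original URL elides them.

```python
from urllib.parse import urlencode, parse_qs

# Request parameters: match everything, then filter on the quoted label.
# "q": "*:*" is an assumed main query, not taken from the original post.
params = {
    "q": "*:*",
    "fq": 'label:"alpha-ravi"',
}

query_string = urlencode(params)
print(query_string)  # q=%2A%3A%2A&fq=label%3A%22alpha-ravi%22

# Decoding gives back the original filter query, quotes included.
decoded = parse_qs(query_string)
print(decoded["fq"][0])  # label:"alpha-ravi"
```

The resulting string is appended after the `?` of the select URL for your core.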

Hope that helps. All the best!
