Solr does not return the exact element

Posted on 2025-01-30 11:42:44 | 274 words | 2 views | 0 comments

Using Solr 7.7.3.
I have an element with the label "alpha-ravi",
and when I search in Solr for label:"alpha", it returns the element with the label "alpha-ravi".
From my reading of the Solr documentation, it should not return this element.
Can anyone explain this behavior?

Comments (1)

吻泪 2025-02-06 11:42:44

If you want exact matches only (i.e. return documents with "alpha-ravi" only when the user searches for exactly "alpha-ravi"), I would suggest going with the keyword tokenizer (solr.KeywordTokenizerFactory). This tokenizer treats the entire "alpha-ravi" as a single token, so a search for just "alpha" or "ravi" will not return it as a partial match.

For example, in your schema.xml file you could add something like this (configure the filter chains as per your needs):

<fieldType name="single_token_string" class="solr.TextField" sortMissingLast="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

You can then use this fieldType in the same schema.xml (referencing the single_token_string type we just defined):

<field name="myField" type="single_token_string" indexed="true" stored="true" />

The text field types in Solr's default schema (such as text_general) use the StandardTokenizer, which splits "alpha-ravi" on the hyphen into the two tokens "alpha" and "ravi". That is why a search for "alpha" (or "ravi") matches the document.
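
To make the difference concrete, here is a rough Python sketch of the two analysis behaviours. It does not call Solr at all: standard_like_tokens and keyword_tokens are hypothetical stand-ins I made up for illustration, and the real StandardTokenizer follows Unicode text segmentation rules that are more nuanced than a regex split.

```python
import re


def standard_like_tokens(text):
    # Simplified stand-in for StandardTokenizer + LowerCaseFilter:
    # split on anything that is not a letter or digit, then lowercase.
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def keyword_tokens(text):
    # Simplified stand-in for KeywordTokenizer + TrimFilter + LowerCaseFilter:
    # the whole field value becomes exactly one token.
    return [text.strip().lower()]


def matches(query, indexed_value, analyze):
    # A document matches when every query token is among its indexed tokens.
    indexed = set(analyze(indexed_value))
    return all(token in indexed for token in analyze(query))


print(standard_like_tokens("alpha-ravi"))
print(keyword_tokens("alpha-ravi"))
print(matches("alpha", "alpha-ravi", standard_like_tokens))
print(matches("alpha", "alpha-ravi", keyword_tokens))
```

With the split-on-hyphen analysis the query "alpha" finds the document, while with the single-token analysis only the full value "alpha-ravi" does.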

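As an alternative, you could also run a phrase query. The phrase is still analyzed by the field's tokenizer, but the resulting tokens must then appear next to each other and in the same order, so "alpha-ravi" is matched as a whole. Possibly something like http://localhost:8983/solr/...fq=label:"alpha-ravi"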
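
If you build that request from code, the quotes and the colon need URL encoding. A minimal sketch with Python's standard library is below; build_filter_params is a hypothetical helper name, and q=*:* is my assumption for the main query (the original only shows the fq part). It produces only the query string, since the host and core path depend on your setup.

```python
from urllib.parse import urlencode


def build_filter_params(field, value):
    # Escape backslashes and embedded double quotes so the value
    # stays inside a single quoted phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    # q=*:* is an assumed match-all main query; fq carries the phrase filter.
    return urlencode({"q": "*:*", "fq": f'{field}:"{escaped}"'})


params = build_filter_params("label", "alpha-ravi")
print(params)
```

Append the resulting string after the "?" of your select URL.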

Hope that helps. All the best!
