Searching for words with uppercase letters in a paragraph using Solr

Posted on 2025-01-07 08:37:41

I am using Solr for searching. When I search for a word that contains uppercase letters in the description field, no results are returned, although the same word in lowercase does return results.

E.g., if my query is q=description:* stack * , I get results. But if the query is q=description:* Stack * , I get no results, even though the description contains that word.

My schema contains:

<fieldType name="string" class="solr.TextField">
 <analyzer type="index">
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory" />
  <filter class="solr.ReversedWildcardFilterFactory" />
 </analyzer>
 <analyzer type="query">
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory" />
  <filter class="solr.ReversedWildcardFilterFactory" />
 </analyzer>
</fieldType>

I also want to be able to search with uppercase letters.

Can someone help me?


Comments (1)

森末i 2025-01-14 08:37:41


Have a look at the Solr wiki. It says:

Add this filter to the index analyzer, but not the query analyzer.

Try querying with debugQuery=on after you've changed the schema to reflect the wiki instructions:

<str name="querystring">text:*Stack*</str>
<str name="parsedquery">text:#1;*kcatS*</str>

As you can see, the ReversedWildcardFilterFactory changes your query even when it is not in your query analyzer chain. The output above was produced with a fieldType like this:

<fieldType name="text" class="solr.TextField">
    <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>           
        <filter class="solr.ReversedWildcardFilterFactory" />       
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>       
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>               
    </analyzer>
</fieldType>

Furthermore, the LowerCaseFilterFactory is not applied to your query (the S is still uppercase in the parsed query). The same goes for ASCIIFoldingFilterFactory.
Have a look here for more details:

Solr does not analyze queries in which there are wildcards. Yes, this
means that the filter LowerCaseFilterFactory, during indexing,
turns Stack to stack but when making queries this is not happening, despite the
fact that the filters are defined correctly. And that is why you don't
get any search results.
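
The mismatch described in the quote can be reproduced without Solr. The sketch below is a plain-Python stand-in, not Solr's implementation: `index_analyze` is a hypothetical helper that only imitates KeywordTokenizerFactory plus LowerCaseFilterFactory (the whole field value becomes one lowercased token), and `fnmatchcase` is a rough substitute for Solr's wildcard matching. The sample text is made up.

```python
from fnmatch import fnmatchcase


def index_analyze(value: str) -> str:
    """Imitate the index-time chain: keep the value as one token and lowercase it."""
    return value.lower()


def wildcard_matches(pattern: str, indexed_token: str) -> bool:
    """Case-sensitive wildcard match; the pattern is used exactly as typed,
    mirroring a wildcard query that skips the query analyzer."""
    return fnmatchcase(indexed_token, pattern)


# Hypothetical field value; at index time it is stored lowercased.
indexed = index_analyze("Questions about Stack Overflow")

print(wildcard_matches("*stack*", indexed))  # True
print(wildcard_matches("*Stack*", indexed))  # False
```

The second pattern fails only because the uppercase S never gets lowercased on the query side, while the indexed token contains no uppercase letters at all.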

The easiest solution that comes to mind is to lowercase your queries on the client side before sending them to Solr. Keep in mind that the ASCIIFoldingFilterFactory is not applied to wildcard queries either. Do you really need it?
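
A minimal sketch of that client-side normalization, assuming a Python client. The helper names `normalize_wildcard_term` and `build_query` are hypothetical, and the accent stripping via `unicodedata` only approximates ASCIIFoldingFilterFactory (it handles characters that decompose into a base letter plus combining marks, not every mapping the Solr filter performs). Escaping of Solr query syntax characters is left out.

```python
import unicodedata


def normalize_wildcard_term(term: str) -> str:
    """Lowercase the term and strip combining accents, roughly what the
    index-time LowerCase and ASCIIFolding filters do to stored values."""
    decomposed = unicodedata.normalize("NFKD", term)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.lower()


def build_query(field: str, term: str) -> str:
    """Build a leading-and-trailing wildcard query for one field."""
    return f"{field}:*{normalize_wildcard_term(term)}*"


print(build_query("description", "Stack"))  # description:*stack*
print(build_query("description", "Café"))   # description:*cafe*
```

The resulting string would be sent as the `q` parameter, so the wildcard term already matches the form of the indexed tokens.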
