SOLR ranks the longer matching term higher

Posted on 2025-01-15 13:14:26 · 1722 characters · 2 views · 0 comments

I am probably missing something simple here. I have the following analyzer configuration for a field type in my SOLR schema:

                <analyzer type="index">
                        <!-- <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([^a-zA-Z0-9\s])" replacement=""/> -->
                        <tokenizer class="solr.StandardTokenizerFactory"/>
                        <filter class="solr.LowerCaseFilterFactory"/>
                        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="100" />
                </analyzer>

                <analyzer type="query">
                        <!--<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([^a-zA-Z0-9\s])" replacement=""/> -->
                        <tokenizer class="solr.StandardTokenizerFactory"/>
                        <filter class="solr.LowerCaseFilterFactory"/>
                </analyzer>

        </fieldType>
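
To make the effect of the index analyzer concrete, here is a rough Python simulation of what it would store for the two values used below. It is only a sketch: the `\w+` regex is a stand-in for StandardTokenizer, the helper name `index_analyze` is made up for illustration, and it assumes that every edge n-gram of a token is stored at that token's position, which is how I understand recent Lucene versions to behave.

```python
import re

def index_analyze(text, min_gram=1, max_gram=100):
    """Approximate the index analyzer: tokenize, lowercase, then edge n-grams.

    Returns (position, gram) pairs. All grams of one token share that
    token's position (an assumption about the n-gram filter's behavior).
    """
    # Rough stand-in for StandardTokenizer + LowerCaseFilter.
    tokens = re.findall(r"\w+", text.lower())
    postings = []
    for position, token in enumerate(tokens):
        # Emit prefixes of length min_gram .. min(max_gram, len(token)).
        for size in range(min_gram, min(max_gram, len(token)) + 1):
            postings.append((position, token[:size]))
    return postings

for value in ("abd cad", "ab cad"):
    print(value, "->", index_analyze(value))
```

Under these assumptions both values end up with the grams "ab" at position 0 and "c" at position 1, so the n-gram field by itself cannot tell a complete token "ab" apart from a prefix of "abd".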

And I have the following two documents in my index:

{
        "app_name":"abd cad",
        "id":"app:75751146",
        "_version_":1727607795167002624}

{
        "user_name":"ab cad",
        "id":"user:75751146",
        "_version_":1727607795167002624}

Then I run a phrase search against the ngram fields defined above:

app_name_ngram:"ab c" OR user_name_ngram:"ab c"
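
As a sanity check on why both documents match this phrase at all, the following Python sketch mimics a slop-0 phrase match against a positional index of edge n-grams. The function names are hypothetical, the tokenizer is again a crude `\w+` approximation, and the query side deliberately produces no n-grams, mirroring the query analyzer above.

```python
import re

def edge_ngram_positions(text):
    """Map each edge n-gram to the set of token positions where it occurs."""
    index = {}
    for position, token in enumerate(re.findall(r"\w+", text.lower())):
        for size in range(1, len(token) + 1):
            index.setdefault(token[:size], set()).add(position)
    return index

def phrase_matches(index, phrase):
    """True if the query terms occur at consecutive positions (slop 0)."""
    # Query analyzer: tokenize and lowercase only, no n-grams.
    terms = re.findall(r"\w+", phrase.lower())
    if not terms:
        return False
    starts = index.get(terms[0], set())
    return any(
        all(start + offset in index.get(term, set())
            for offset, term in enumerate(terms))
        for start in starts
    )

for value in ("abd cad", "ab cad"):
    print(value, phrase_matches(edge_ngram_positions(value), "ab c"))
```

In this model the phrase "ab c" matches both "abd cad" and "ab cad", because "ab" is a prefix of the first token and "c" is a prefix of the second token in each case.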

The results come back in the following order:

{
        "app_name":"abd cad",
        "id":"app:75751146",
        "_version_":1727607791167002624}

{
        "user_name":"ab cad",
        "id":"user:75751146",
        "_version_":1727607795167002624}

It looks like SOLR is scoring the app_name_ngram match higher than the user_name_ngram match. Ideally, I want results ranked by how closely they match the provided phrase "ab c" (something like shortest match first), so that the document with the user name "ab cad" appears before the one with "abd cad". How can I achieve this in SOLR?
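
One direction that might be worth trying, sketched here in Python as a query builder: keep the n-gram phrase clauses for recall and add boosted clauses on non-ngram fields, so that a document where "ab" is a complete token scores higher. This rests on assumptions the post does not confirm: that `app_name` and `user_name` are indexed as searchable text fields without n-grams, that the standard query parser is in use, and that a boost of 5 is a sensible starting value. The function `boosted_query` is hypothetical and does no escaping of query syntax characters.

```python
def boosted_query(phrase, boost=5):
    """Build a query string: n-gram phrase match plus boosted whole-token match.

    Assumes app_name / user_name are searchable non-ngram fields (not
    confirmed by the schema excerpt) and that phrase has no special characters.
    """
    # Original clauses: prefix-style phrase match on the n-gram fields.
    ngram = f'app_name_ngram:"{phrase}" OR user_name_ngram:"{phrase}"'
    # Extra clauses: whole-token matches on the plain fields, boosted.
    terms = " OR ".join(phrase.split())
    exact = f"app_name:({terms})^{boost} OR user_name:({terms})^{boost}"
    return f"({ngram}) OR {exact}"

print(boosted_query("ab c"))
```

With the two documents above, only "ab cad" contains "ab" as a whole token, so under these assumptions only that document would pick up the boosted clause. The right boost value would need tuning against real data.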
