SOLR ranking larger search terms higher
I am probably missing something simple here. I have the following configuration under my SOLR schema:
  <analyzer type="index">
    <!-- <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([^a-zA-Z0-9\s])" replacement=""/> -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="100"/>
  </analyzer>
  <analyzer type="query">
    <!-- <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([^a-zA-Z0-9\s])" replacement=""/> -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
And I have the following two documents in my index:
{
  "app_name":"abd cad",
  "id":"app:75751146",
  "_version_":1727607795167002624
}
{
  "user_name":"ab cad",
  "id":"user:75751146",
  "_version_":1727607795167002624
}
Then I am trying to do a phrase search based on ngram fields defined above:
app_name_ngram:"ab c" OR user_name_ngram:"ab c"
The results come back in the following order:
{
  "app_name":"abd cad",
  "id":"app:75751146",
  "_version_":1727607791167002624
}
{
  "user_name":"ab cad",
  "id":"user:75751146",
  "_version_":1727607795167002624
}
It looks like SOLR is scoring the app_name_ngram match higher than the user_name_ngram match. Ideally, I want results ranked by how closely they match the provided phrase "ab c" (kind of like shortest match first?), so that the document containing the user name "ab cad" appears before the one containing "abd cad". How can I achieve this in SOLR?
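To make it clearer why both documents qualify for the phrase query, here is a rough Python sketch of what I believe the analyzers above produce. It is only an approximation: the regex is a crude stand-in for StandardTokenizer, the function names are made up for illustration, and it assumes that all edge n-grams of a word share that word's token position (which, as far as I know, is how EdgeNGramFilter behaves in current Lucene versions).

```python
import re


def index_tokens(text, min_gram=1, max_gram=100):
    """Approximate the index analyzer: split into words, lowercase, emit edge n-grams.

    Returns a list of sets; entry i holds every gram stored at token position i.
    """
    words = re.findall(r"\w+", text.lower())  # rough stand-in for StandardTokenizer
    return [
        {word[:n] for n in range(min_gram, min(len(word), max_gram) + 1)}
        for word in words
    ]


def query_tokens(text):
    """Approximate the query analyzer: split into words and lowercase, no n-grams."""
    return re.findall(r"\w+", text.lower())


def phrase_matches(doc_text, phrase):
    """Return True if the query terms occur at consecutive positions in the document."""
    positions = index_tokens(doc_text)
    terms = query_tokens(phrase)
    if not terms:
        return False
    for start in range(len(positions) - len(terms) + 1):
        if all(term in positions[start + i] for i, term in enumerate(terms)):
            return True
    return False


if __name__ == "__main__":
    for name in ("abd cad", "ab cad"):
        print(name, index_tokens(name), phrase_matches(name, "ab c"))
```

Under these assumptions, "ab" is indexed at the first position and "c" at the second position of both "abd cad" and "ab cad", so both documents satisfy the phrase "ab c". The phrase match itself does not seem to distinguish a full-word hit ("ab") from a prefix hit ("abd"), which would leave the ordering entirely to the relevance score.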