Sphinx search: exact match first, then infix match
I am using Sphinx to provide search to a website and I've run across a bit of a snag when returning relevant results.
To keep my question simple, let's assume that I have two fields, @title and @body, weighted 100 and 15 respectively. When I search for a short word such as 'in', I would like exact matches for that term to rank highest, and matches for 'in*|*in|*in*' to rank slightly lower. Is there any way to get this kind of specificity in a search?
Example results for 'in':
- Indian Food
- In The Middle
- Document about Latin
Some relevant settings are:
In sphinx.conf:
morphology = stem_en
charset_type = utf-8
min_word_len = 2
min_prefix_len = 0
min_infix_len = 2
enable_star = 1
In search.php:
$sp->SetMatchMode( SPH_MATCH_EXTENDED2 );
$sp->SetRankingMode( SPH_RANK_PROXIMITY_BM25 );
$sp->SetFieldWeights ( array('title' => 100, 'body' => 15) );
As a side note, I've had some instances where partial matches don't show up in the search results at all. For example, I searched for Cow but Cowboy did not show up as a result. I also searched for Cowb and Cowbo, and it wasn't until I typed Cowboy that I received the expected result. Any thoughts?
This question is along the same lines as this previous SO question, but I hope I've given a little more detail as to my problem and the things I've tried to warrant a solution.
Comments (2)
It looks like Cow is not morphologically related to Cowboy, so stemming alone will not match one to the other.
You could solve it in two ways:
Regarding different ranking for "in" and "*in*", I would suggest having two body fields in the index, say body and body_star, where body_star holds the same content as the body field.
in search.php
This should do the trick.
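One way the two-field idea could be set up is sketched below. This is a hedged illustration, not the answerer's original code, which did not survive on this page. The source and index names, the documents table, and the choice of infix_fields to restrict infix indexing to body_star are all assumptions made for the example.

```
# Hypothetical sphinx.conf fragment; connection and path settings omitted.
source documents_src
{
    # Assumed table and column names. body is selected twice so that
    # it gets indexed as two separate full-text fields.
    sql_query = SELECT id, title, body, body AS body_star FROM documents
}

index documents
{
    source        = documents_src
    morphology    = stem_en
    min_word_len  = 2
    min_infix_len = 2
    enable_star   = 1

    # Assumption: only body_star is infix-indexed, so title and body
    # match whole (stemmed) words only.
    infix_fields  = body_star
}
```

With a setup along these lines, a query such as @(title,body) in | @body_star *in* could match whole words in title and body and substrings only in body_star. Giving body_star a lower weight than body in SetFieldWeights should then rank the substring hits lower, though the actual weights would need tuning against real data.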
You could also set the expand_keywords option in your config:
http://sphinxsearch.com/docs/1.10/conf-expand-keywords.html
and set the ranking mode to SPH_RANK_SPH04:
http://sphinxsearch.com/blog/2010/08/17/how-sphinx-relevance-ranking-works/
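As a rough sketch of what this suggestion might look like in sphinx.conf, assuming Sphinx 1.10 or later and a hypothetical index named documents:

```
# Hypothetical index block; only the directives relevant here are shown.
index documents
{
    morphology        = stem_en
    min_infix_len     = 2
    enable_star       = 1

    # Index unstemmed word forms too, so expansion can add an exact form.
    index_exact_words = 1

    # Per the linked documentation, a keyword such as "running" is then
    # internally expanded to ( running | *running* | =running ).
    expand_keywords   = 1
}
```

On the PHP side this would be paired with $sp->SetRankingMode( SPH_RANK_SPH04 ); in place of SPH_RANK_PROXIMITY_BM25. With expansion enabled, a plain query for Cow should also be tried as *Cow*, which may address the Cowboy case from the question, and exact-form hits should generally rank above infix hits. Expanded queries can take longer to run, so it is worth measuring on real data.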