Sphinx search: exact match first, then infix match

Posted on 2024-12-01 11:02:12

I am using Sphinx to provide search to a website and I've run across a bit of a snag when returning relevant results.

To keep my question simple, let's assume that I have two fields, @title and @body, which are weighted 100 & 15 respectively. When I search for small words like the word 'in' I would like to have it rank exact matches for that search term higher and then check for matches to 'in*|*in|*in*' and rank them slightly lower. Is there any way to have this type of specificity for your searches?

Example results for 'in':

  1. Indian Food
  2. In The Middle
  3. Document about Latin

Some relevant settings are:

In sphinx.conf:

morphology              = stem_en
charset_type            = utf-8
min_word_len            = 2
min_prefix_len          = 0
min_infix_len           = 2
enable_star             = 1

In search.php:

$sp->SetMatchMode( SPH_MATCH_EXTENDED2 );
$sp->SetRankingMode( SPH_RANK_PROXIMITY_BM25 );
$sp->SetFieldWeights ( array('title' => 100, 'body' => 15) );

Also, as a side note: I've also had some instances where partial matches don't even show up in the search results. For example, I have searched for Cow but Cowboy does not show up as a result. I have also searched for Cowb and Cowbo and it wasn't until I typed Cowboy that I received the expected result. Any thoughts?


This question is along the same lines as this previous SO question, but I hope I've given a little more detail as to my problem and the things I've tried to warrant a solution.


Comments (2)

凉宸 2024-12-08 11:02:12

Morphologically, "Cow" is not related to "Cowboy", so the stemmer does not treat them as forms of the same word.

You could solve it in two ways:

  1. Use a wordforms file with an entry such as Cow > Cowboy.
  2. Since star syntax is enabled, change the query from "Cow" to "Cow*", which will find all words starting with "Cow".
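
A minimal sketch of the second option, assuming the index is built with the infix settings from the question (min_infix_len = 2, enable_star = 1). The helper name toPrefixQuery is hypothetical and not part of the Sphinx API; it only builds the query string, which would then be passed to $sp->Query().

```php
<?php
// Hypothetical helper: turn a user keyword into a prefix (star) query.
function toPrefixQuery(string $term): string
{
    // Keep only letters, digits and underscores so the keyword cannot
    // inject extended-syntax operators such as @, | or ".
    $clean = preg_replace('/[^\p{L}\p{N}_]+/u', '', $term);

    // An empty keyword yields an empty query instead of a bare "*".
    return $clean === '' ? '' : $clean . '*';
}

echo toPrefixQuery('Cow'), "\n"; // prints: Cow*

// Assumed usage with the client from the question:
// $result = $sp->Query(toPrefixQuery('Cow'));
```

Appending the star in code keeps the search box simple for users, at the cost of every keyword becoming a prefix search.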

Regarding the different ranking for "in" and "*in*", I would suggest having two body fields in the index, say body and body_star, where body_star holds the same content as body.

In search.php:

// Rank by phrase proximity plus BM25.
$sp->SetRankingMode( SPH_RANK_PROXIMITY_BM25 );
// Extended query syntax is required for the @field operators.
$sp->SetMatchMode( SPH_MATCH_EXTENDED2 );
// Exact matches in title and body outweigh infix matches in body_star.
$sp->SetFieldWeights ( array('title' => 20, 'body' => 15, 'body_star' => 5) );
$sp->Query("@body in @body_star *in* @title in");

This should do the trick.
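
One caveat: in the extended query syntax, terms written next to each other are combined with an implicit AND, so a query of this shape only returns documents that match in all three fields. If the goal is "match in any field, then let the field weights decide the rank", an OR between the field-limited parts may be closer to the intent. The sketch below makes that assumption; buildRankedQuery is a hypothetical helper, and the parenthesised form is worth verifying against your Sphinx version.

```php
<?php
// Hypothetical helper: one OR branch per field, so a document matching in
// any single field is returned and the field weights decide its rank.
function buildRankedQuery(string $term): string
{
    // Strip everything except letters, digits and underscores so the
    // keyword cannot inject query operators.
    $clean = preg_replace('/[^\p{L}\p{N}_]+/u', '', $term);
    if ($clean === '') {
        return '';
    }

    // Exact keyword in title and body, infix match only in body_star.
    return sprintf(
        '(@title %1$s) | (@body %1$s) | (@body_star *%1$s*)',
        $clean
    );
}

echo buildRankedQuery('in'), "\n";
// prints: (@title in) | (@body in) | (@body_star *in*)

// Assumed usage with the client configured as in the answer:
// $result = $sp->Query(buildRankedQuery('in'));
```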

心不设防 2024-12-08 11:02:12

You could also set the expand_keywords option in your config:
http://sphinxsearch.com/docs/1.10/conf-expand-keywords.html
and set the ranking mode to SPH_RANK_SPH04:
http://sphinxsearch.com/blog/2010/08/17/how-sphinx-relevance-ranking-works/
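
As a rough sketch of what that could look like, the fragment below adds expand_keywords to the index settings from the question. According to the linked documentation, each keyword is then expanded internally into something like ( running | *running* | =running ), so a plain query for "in" would also pick up infix matches. The index_exact_words line is an addition not mentioned in the answer; the documentation describes it as required for the exact-form branch when stemming is enabled.

```
# inside the index definition in sphinx.conf
morphology        = stem_en
min_infix_len     = 2
enable_star       = 1
expand_keywords   = 1
index_exact_words = 1
```

The index needs to be rebuilt after changing these settings. On the PHP side, the only change would be passing SPH_RANK_SPH04 to SetRankingMode(), which assumes a Sphinx version that provides that ranker.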
