Sphinx returns inconsistent result sets depending on sort order
I'm trying to implement multilingual indexes for the web application I'm developing. At the moment, records exist in a few languages: English, Malay, and Arabic (but they are not separated into different columns). Only the English stemmer is currently enabled.
Only two indexes are built: a stemmed one and a non-stemmed one. The problem is with the stemmed index: the result set it returns is not consistent and depends on the sort column.
These two queries (against the stemmed index) each return a different number of total results, although the only difference between them is the sort order.
SELECT * FROM test1stemmed WHERE MATCH('@institution universiti') GROUP BY art_id ORDER BY art_title_ord ASC;
SELECT * FROM test1stemmed WHERE MATCH('@institution universiti') GROUP BY art_id ORDER BY art_title_ord DESC;
However, if the same queries are run on the non-stemmed index, the numbers of results are equal.
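One way to pin down which counter actually differs, assuming the queries are issued from a SphinxQL client session, is to run SHOW META immediately after each query and compare the reported total and total_found values. This is only a diagnostic sketch; it does not change the results.

```sql
/* Run the ascending query, then inspect the counters for that query */
SELECT * FROM test1stemmed WHERE MATCH('@institution universiti') GROUP BY art_id ORDER BY art_title_ord ASC;
SHOW META;

/* Repeat with the descending sort and compare total / total_found */
SELECT * FROM test1stemmed WHERE MATCH('@institution universiti') GROUP BY art_id ORDER BY art_title_ord DESC;
SHOW META;
```

If total_found differs between the two runs while the keyword statistics stay the same, the difference comes from grouping rather than from matching.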
I have the same problem with the Sphinx PHP API:
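$sp = new SphinxClient();
$sp->SetServer('localhost', 9312);
$sp->SetMatchMode(SPH_MATCH_EXTENDED);
// Group by art_id; the third argument is the group sort clause, e.g. "art_title_ord ASC"
$sp->SetGroupBy('art_id', SPH_GROUPBY_ATTR, "$sp_sort_column $sort");
// The third argument to SetLimits is max_matches (1000 here)
$sp->SetLimits($offset, $rows_per_page, 1000);
$result = $sp->Query($q, 'test1stemmed');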
What am I missing?
Comments (1)
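I had missed something in the documentation here: http://sphinxsearch.com/docs/2.0.2/clustering.html. That page warns that grouping is done in fixed memory, so its results (including total_found) are only approximate unless max_matches is large enough to hold every group found.
So I can work around this by increasing max_matches, but since setting a very large value is undesirable, I would rather fix the query instead.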
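For reference, the max_matches workaround mentioned above could be sketched in SphinxQL with a per-query OPTION clause. The value 10000 is purely illustrative and not from the original post. As far as I understand, in Sphinx 2.0.x the per-query value cannot exceed the max_matches setting in the searchd section of sphinx.conf, so that setting may need raising too.

```sql
/* Hypothetical limit: 10000 is an assumed value, chosen only for illustration. */
/* It should be at least as large as the number of distinct art_id groups matched. */
SELECT * FROM test1stemmed WHERE MATCH('@institution universiti') GROUP BY art_id ORDER BY art_title_ord ASC OPTION max_matches=10000;
```

With the PHP API, the equivalent knob is the third argument to SetLimits, which is 1000 in the question's code.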