Scalable full-text search with results sorted per user

Posted on 2024-08-31 12:22:50

What options exist for creating a scalable full-text search whose results need to be sorted on a per-user basis? This is for PHP/MySQL (Symfony/Doctrine as well, if relevant).

In our case, we have a database of workouts that have been performed by users. The workouts that the user has done before should appear at the top of the results. The more frequently they've done the workout, the higher it should appear in search matches. If it helps, you can assume we know the number of times a user has done a workout in advance.

Possible Solutions

Sphinx - Use Sphinx to implement full text search, do all the querying and sorting in MySQL. This seems promising (and there's a Symfony Plugin!) but I don't know much about it.

Lucene - Use Lucene to perform full-text search and put the user's completions into the query, as suggested in this Stack Overflow thread. Alternatively, use Lucene to retrieve the results, then reorder them in PHP. However, both solutions seem clunky and potentially unscalable, as a user may have completed hundreds of workouts.

MySQL - No native full-text support (InnoDB), so we'd have to use LIKE or REGEXP, which isn't scalable.

Comments (2)

他不在意 2024-09-07 12:22:50

MySQL does have native FULLTEXT support, though before MySQL 5.6 it was available only in MyISAM tables.

For most real-world tasks, Sphinx is the fastest engine. However, it is an external index, so it can only be updated periodically, for example with a cron script.

By using SphinxSE (a pluggable MySQL interface to Sphinx), you can join MySQL tables and Sphinx indexes in one query. Updating, though, will still require an external script.

Since the number of workouts performed seems to change frequently, keeping it in Sphinx would mean rebuilding the index too often.

With SphinxSE, you can write a query similar to this:

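-- workouts is assumed to be a SphinxSE table (ENGINE=SPHINX); its query
-- column carries the search text and options.
-- $user_id is interpolated by PHP; cast it to an integer first.
SELECT  *
FROM    workouts w
JOIN    user_workouts uw
ON      uw.workout = w.id
WHERE   w.query = 'query query query;filter=user_id,$user_id'
        AND uw.user = $user_id
ORDER BY
        uw.times_performed DESC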

这个俗人 2024-09-07 12:22:50

I'm not sure why you're assuming using Lucene would be unscalable. Hundreds of workouts per user is not a lot of data to deal with.

Try using Solr/Lucene for the search backend. It has a JSON/XML interface that will play nicely with your PHP frontend. Store the number of times each user has completed each workout in a database table. When a query is issued, take the results from Solr, select the matching counts from that table, and re-sort in PHP code. This should be plenty fast and scalable. With Solr, maintaining your index is dirt simple; just issue add/update/delete requests to your Solr server.
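
The re-sort step could look roughly like the sketch below. It assumes the search engine has already returned workout IDs in relevance order and that the per-user counts were loaded into an array keyed by workout ID; the function name `rerankForUser` and both array shapes are hypothetical, not part of Solr or any library.

```php
<?php
// Hypothetical helper: re-rank search hits by how often one user did each workout.
// $hits:           workout IDs in the order the search engine returned them.
// $timesPerformed: map of workout ID => completion count for that user,
//                  e.g. loaded from a user_workouts table (assumed schema).
function rerankForUser(array $hits, array $timesPerformed): array
{
    // Remember each hit's original position so ties keep relevance order.
    $position = array_flip($hits);

    usort($hits, function ($a, $b) use ($timesPerformed, $position) {
        // Workouts the user never did count as zero completions.
        $countA = $timesPerformed[$a] ?? 0;
        $countB = $timesPerformed[$b] ?? 0;

        // More completions first; otherwise fall back to the original order.
        return ($countB <=> $countA) ?: ($position[$a] <=> $position[$b]);
    });

    return $hits;
}

// Example: the engine returned workouts 7, 3, 9, 4 in relevance order.
$hits = [7, 3, 9, 4];
$timesPerformed = [9 => 12, 3 => 2]; // workout 9 done twelve times, workout 3 twice

print_r(rerankForUser($hits, $timesPerformed)); // order: 9, 3, 7, 4
```

Workouts the user has never done stay in the result set and keep the engine's relevance order, so they simply sink below the ones with a completion count. The sort only touches the page of hits fetched from the engine, so hundreds of completed workouts per user mean one indexed lookup and a small in-memory sort.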
