Best way to get result count before LIMIT was applied
When paging through data that comes from a DB, you need to know how many pages there will be to render the page jump controls.

Currently I do that by running the query twice: once wrapped in a count() to determine the total results, and a second time with a limit applied to get back just the results I need for the current page. This seems inefficient.

Is there a better way to determine how many results would have been returned before LIMIT was applied?

I am using PHP and Postgres.
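For illustration, the two-query approach looks something like this (the table, filter, and ordering column are hypothetical stand-ins, not from the question):

```sql
-- Query 1: total number of matching rows, to compute the page count
SELECT count(*) FROM items WHERE category = 'widgets';

-- Query 2: only the rows for the current page (page 5, 10 rows per page)
SELECT *
FROM   items
WHERE  category = 'widgets'
ORDER  BY created_at
LIMIT  10 OFFSET 40;
```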
5 Answers
Pure SQL
Things have changed since 2008. You can use a window function to get the full count and the limited result in one query. Introduced with PostgreSQL 8.4 in 2009.
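The query takes roughly this shape (a sketch; tbl, col, and the filter are placeholders for your own table, ordering column, and conditions):

```sql
SELECT *
     , count(*) OVER() AS full_count   -- window function: counts all qualifying rows
FROM   tbl
WHERE  /* your conditions */ true
ORDER  BY col
LIMIT  10
OFFSET 40;
```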
Note that this can be considerably more expensive than without the total count. All rows have to be counted, and a possible shortcut taking just the top rows from a matching index may not be helpful any more.
Doesn't matter much with small tables or full_count <= OFFSET + LIMIT. Matters for a substantially bigger full_count.

Corner case: when OFFSET is at least as great as the number of rows from the base query, no row is returned, so you also get no full_count. A possible alternative is sketched below.
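That alternative could look like this (my sketch, reusing the hypothetical tbl and col): run the base query once in a CTE, take the page from it, and outer-join the page to the count, so a row carrying full_count comes back even when the page itself is empty:

```sql
WITH cte AS (
   SELECT *
   FROM   tbl
   WHERE  /* your conditions */ true
   )
SELECT sub.*, c.full_count
FROM  (
   SELECT *
   FROM   cte
   ORDER  BY col
   LIMIT  10
   OFFSET 4000              -- may point past the last row
   ) sub
RIGHT  JOIN (SELECT count(*) FROM cte) c(full_count) ON true;
```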
Sequence of events in a SELECT query

( 0. CTEs are evaluated and materialized separately. In Postgres 12 or later the planner may inline those like subqueries before going to work.) Not here.
1. The WHERE clause (and JOIN conditions, though none in your example) filters qualifying rows from the base table(s). The rest is based on the filtered subset.
( 2. GROUP BY and aggregate functions would go here.) Not here.
( 3. Other SELECT list expressions are evaluated, based on grouped / aggregated columns.) Not here.
4. Window functions are applied depending on the OVER clause and the frame specification of the function. The simple count(*) OVER() is based on all qualifying rows.
5. ORDER BY sorts the qualifying rows.
( 6. DISTINCT or DISTINCT ON would go here.) Not here.
7. LIMIT / OFFSET are applied based on the established order to select rows to return.
LIMIT / OFFSET becomes increasingly inefficient with a growing number of rows in the table. If you need better performance, consider alternative approaches such as keyset pagination; a sketch follows.
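With keyset pagination (this sketch is mine, with hypothetical tbl, col, and id), you remember the sort key of the last row of the previous page and continue strictly after it, so an index on the sort key can find the page directly instead of skipping OFFSET rows:

```sql
-- Next page: rows strictly after the last row previously seen.
-- The row comparison (col, id) > (...) matches the ORDER BY,
-- so an index on (col, id) can serve this without skipping rows.
SELECT *
FROM   tbl
WHERE  (col, id) > (:last_col, :last_id)
ORDER  BY col, id
LIMIT  10;
```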
Alternatives to get final count

There are completely different approaches to get the count of affected rows (not the full count before OFFSET & LIMIT were applied). Postgres has internal bookkeeping of how many rows were affected by the last SQL command. Some clients can access that information or count rows themselves (like psql).

For instance, you can retrieve the number of affected rows in plpgsql immediately after executing an SQL command with GET DIAGNOSTICS.
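A minimal, self-contained sketch (the DO block and the dummy command are mine for illustration; the relevant statement is GET DIAGNOSTICS ... ROW_COUNT):

```sql
DO
$$
DECLARE
   row_ct int;
BEGIN
   -- run any SQL command; PERFORM is plpgsql's way to run a SELECT and discard the result
   PERFORM * FROM generate_series(1, 42);
   -- read the row count Postgres recorded for that last command
   GET DIAGNOSTICS row_ct = ROW_COUNT;
   RAISE NOTICE 'rows processed: %', row_ct;   -- prints 42
END
$$;
```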
Details in the manual.
Or you can use pg_num_rows in PHP, or similar functions in other clients.
As I describe on my blog, MySQL has a feature called SQL_CALC_FOUND_ROWS. This removes the need to do the query twice, but it still needs to do the query in its entirety, even if the limit clause would have allowed it to stop early.

As far as I know, there is no similar feature for PostgreSQL. One thing to watch out for when doing pagination (the most common thing for which LIMIT is used, IMHO): doing an "OFFSET 1000 LIMIT 10" means that the DB has to fetch at least 1010 rows, even if it only gives you 10. A more performant way is to remember the value of the ordering column for the last row of the previous page (the 1000th in this case) and rewrite the query as "... WHERE order_row > value_of_1000th LIMIT 10"; see the sketch below. The advantage is that "order_row" is most probably indexed (if not, you've got a problem). The disadvantage is that if new elements are added between page views, this can get a little out of sync (but then again, it may not be observable by visitors and can be a big performance gain).
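As a query sketch (items and order_row are illustrative names, not from the answer):

```sql
-- Instead of: SELECT * FROM items ORDER BY order_row OFFSET 1000 LIMIT 10;
SELECT *
FROM   items
WHERE  order_row > :value_of_1000th   -- last value seen on the previous page
ORDER  BY order_row
LIMIT  10;
```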
You could mitigate the performance penalty by not running the COUNT() query every time. Cache the number of pages for, say, 5 minutes before the query is run again. Unless you're seeing a huge number of INSERTs, that should work just fine.
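One way to keep such a cache in the database itself (a sketch of mine, not from the answer; the table and key names are made up):

```sql
CREATE TABLE IF NOT EXISTS page_count_cache (
   query_key   text PRIMARY KEY,
   total_rows  bigint NOT NULL,
   computed_at timestamptz NOT NULL DEFAULT now()
);

-- Recompute the count only when the cached value is missing or older than 5 minutes
INSERT INTO page_count_cache (query_key, total_rows)
SELECT 'items_all', (SELECT count(*) FROM items)
WHERE  NOT EXISTS (
   SELECT 1 FROM page_count_cache
   WHERE  query_key = 'items_all'
   AND    computed_at > now() - interval '5 minutes'
)
ON CONFLICT (query_key)
DO UPDATE SET total_rows = EXCLUDED.total_rows, computed_at = now();

SELECT total_rows FROM page_count_cache WHERE query_key = 'items_all';
```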
Since Postgres already does a certain amount of caching, this type of method isn't as inefficient as it seems. It's definitely not doubling execution time. We have timers built into our DB layer, so I have seen the evidence.
Seeing as you need to know for the purpose of paging, I'd suggest running the full query once, writing the data to disk as a server-side cache, then feeding that through your paging mechanism.
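If that cache lives in the database rather than in the application (my interpretation; the answer may equally mean an app-level file cache), a sketch could look like this, reusing the hypothetical items table from above:

```sql
-- Materialize the full result once, numbering rows in display order
CREATE TABLE results_cache AS
SELECT row_number() OVER (ORDER BY created_at) AS rn, *
FROM   items
WHERE  category = 'widgets';

-- Serve page 5 (10 rows per page) straight from the cache
SELECT * FROM results_cache WHERE rn BETWEEN 41 AND 50 ORDER BY rn;
```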
If you're running the COUNT query for the purpose of deciding whether to provide the data to the user or not (i.e. if there are > X records, give back an error), you need to stick with the COUNT approach.