具有递归性能的 PostgreSQL

发布于 2024-11-05 03:09:30 字数 190 浏览 3 评论 0原文

我有一个简单的问题。不知何故，我无法找到明确的答案。

PostgreSQL 中的 WITH RECURSIVE 语法优化了多少？我的意思是：它仅仅是一系列非递归查询的语法糖，还是它更像是一个单一的语句，尽管其复杂的语义已被整体优化。一个后续问题 - 关于这种语法可以优化多少？当然，关于这个问题的一些具体数据是最受欢迎的。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

不甘平庸 2024-11-12 03:09:30

我发现它在一定程度上得到了优化。

各种子查询按预期重新使用并单独优化，Postgres 优化后者就像任何其他查询一样。

我对它的主要抱怨是它不会在可能的情况下向 CTE 注入约束。

例如：

with recursive
parents as (
select node.id,
       node.parent_id
from nodes as node
union all
select node.id,
       parent.parent_id
from parents as node
join nodes as parent on parent.id = node.parent_id
)
select parent_id
from parents
where id = 2;

Postgres 理想地理解，在上面，（因为 node.id 按原样返回）它可以执行以下操作：

with recursive
parents as (
select node.id,
       node.parent_id
from nodes as node
where id = 2
union all
select node.id,
       parent.parent_id
from parents as node
join nodes as parent on parent.id = node.parent_id
)
select parent_id
from parents;

...并对主键使用索引扫描。实际上，它实际上会按照 CTE 的指示执行操作：递归地提取所有行的所有父项，如果需要，将结果集放入未命名的临时表中，然后检查结果集中的每一行是否有 id = 2.

换句话说，CTE 不会跟踪它返回的“原始”表/行/列集。在得到正确优化之前，在递归查询上创建视图充其量是疯狂的。

同时，一个好的解决方法是创建一个 sql 函数：

create function parents(id int) as returns table (id int) $
    with recursive
    parents as (
    select node.id,
           node.parent_id
    from nodes as node
    where id = $1
    union all
    select node.id,
           parent.parent_id
    from parents as node
    join nodes as parent on parent.id = node.parent_id
    )
    select parent_id
    from parents;
$ language sql stable strict rows 5 cost 1;

另一个问题是不能将 FOR UPDATE 与递归 CTE 一起使用（事实上，出于几乎相同的原因）。

I've found it optimized up to a point.

The various subqueries are re-used as expected and are optimized individually, and Postgres optimizes the latter just like any other query.

My main gripe with it has to do with that it won't inject constraints into the CTEs when it could.

For instance:

with recursive
parents as (
select node.id,
       node.parent_id
from nodes as node
union all
select node.id,
       parent.parent_id
from parents as node
join nodes as parent on parent.id = node.parent_id
)
select parent_id
from parents
where id = 2;

Postgres would ideally understand, in the above, that (since node.id is returned as is) it can do:

with recursive
parents as (
select node.id,
       node.parent_id
from nodes as node
where id = 2
union all
select node.id,
       parent.parent_id
from parents as node
join nodes as parent on parent.id = node.parent_id
)
select parent_id
from parents;

... and use an index scan on the primary key. In practice, it'll actually do exactly when the CTE tells it to do: recursively pull all parents for all rows, place the result set in an unnamed temporary table if needed, and then check each row from the result set one for id = 2.

In other words, a CTE does not keep a trace of the "originating" table/row/column set that it's returning. Until this gets optimized properly, creating a view on a recursive query is crazy at best.

A good workaround in the meanwhile is to create an sql function instead:

create function parents(id int) as returns table (id int) $
    with recursive
    parents as (
    select node.id,
           node.parent_id
    from nodes as node
    where id = $1
    union all
    select node.id,
           parent.parent_id
    from parents as node
    join nodes as parent on parent.id = node.parent_id
    )
    select parent_id
    from parents;
$ language sql stable strict rows 5 cost 1;

Another issue is you can't use FOR UPDATE with recursive CTEs (for very much the same reason, in fact).

回复收藏 0 原文