Will Postgres push down a WHERE clause into a VIEW with a window function (aggregate)?

Posted 2024-12-06 14:46:39

The PostgreSQL docs on window functions say:

The rows considered by a window function are those of the "virtual table" produced by the query's FROM clause as filtered by its WHERE, GROUP BY, and HAVING clauses if any. For example, a row removed because it does not meet the WHERE condition is not seen by any window function. A query can contain multiple window functions that slice up the data in different ways by means of different OVER clauses, but they all act on the same collection of rows defined by this virtual table.
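
The quoted behavior can be checked on a toy table. The following is a minimal sketch, not the real schema: it uses Python's bundled SQLite (assumed to be 3.25 or newer, for window function support) as a stand-in for PostgreSQL, and the four-row `options` table with only `fkey_style` and `header` columns is a hypothetical reduction. With the WHERE clause at the same query level as the window function, the count only sees the rows that survive the filter.

```python
import sqlite3

# Toy stand-in for chrome_nvd.options (hypothetical data, SQLite instead of PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (fkey_style INTEGER, header TEXT);
    INSERT INTO options VALUES (1, 'A'), (1, 'A'), (2, 'A'), (2, 'B');
""")

# WHERE and the window function are in the SAME query level,
# so the rows are filtered before count(*) OVER (...) is evaluated.
rows = conn.execute("""
    SELECT fkey_style, header,
           count(*) OVER (PARTITION BY header) AS header_count
      FROM options
     WHERE fkey_style = 1
     ORDER BY header
""").fetchall()

print(rows)  # header 'A' has 3 rows in the table, but only 2 pass the filter
```

Here `header_count` is 2 for both returned rows, even though header 'A' appears three times in the unfiltered table.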

However, I'm not seeing this. In the plan below, the Filter for my WHERE clause sits near the left margin at the top of the plan, which means it is the last thing done.

=# EXPLAIN SELECT * FROM chrome_nvd.view_options where fkey_style = 303451;
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Subquery Scan view_options  (cost=2098450.26..2142926.28 rows=14825 width=180)
   Filter: (view_options.fkey_style = 303451)
   ->  Sort  (cost=2098450.26..2105862.93 rows=2965068 width=189)
         Sort Key: o.sequence
         ->  WindowAgg  (cost=1446776.02..1506077.38 rows=2965068 width=189)
               ->  Sort  (cost=1446776.02..1454188.69 rows=2965068 width=189)
                     Sort Key: h.name, k.name
                     ->  WindowAgg  (cost=802514.45..854403.14 rows=2965068 width=189)
                           ->  Sort  (cost=802514.45..809927.12 rows=2965068 width=189)
                                 Sort Key: h.name
                                 ->  Hash Join  (cost=18.52..210141.57 rows=2965068 width=189)
                                       Hash Cond: (o.fkey_opt_header = h.id)
                                       ->  Hash Join  (cost=3.72..169357.09 rows=2965068 width=166)
                                             Hash Cond: (o.fkey_opt_kind = k.id)
                                             ->  Seq Scan on options o  (cost=0.00..128583.68 rows=2965068 width=156)
                                             ->  Hash  (cost=2.21..2.21 rows=121 width=18)
                                                   ->  Seq Scan on opt_kind k  (cost=0.00..2.21 rows=121 width=18)
                                       ->  Hash  (cost=8.80..8.80 rows=480 width=31)
                                             ->  Seq Scan on opt_header h  (cost=0.00..8.80 rows=480 width=31)
(19 rows)

These two WindowAgg nodes change the plan into something that seems to never finish. Without them, the plan is the much faster:

                                                                       QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan view_options  (cost=329.47..330.42 rows=76 width=164) (actual time=20.263..20.403 rows=42 loops=1)
   ->  Sort  (cost=329.47..329.66 rows=76 width=189) (actual time=20.258..20.300 rows=42 loops=1)
         Sort Key: o.sequence
         Sort Method:  quicksort  Memory: 35kB
         ->  Hash Join  (cost=18.52..327.10 rows=76 width=189) (actual time=19.427..19.961 rows=42 loops=1)
               Hash Cond: (o.fkey_opt_header = h.id)
               ->  Hash Join  (cost=3.72..311.25 rows=76 width=166) (actual time=17.679..18.085 rows=42 loops=1)
                     Hash Cond: (o.fkey_opt_kind = k.id)
                     ->  Index Scan using options_pkey on options o  (cost=0.00..306.48 rows=76 width=156) (actual time=17.152..17.410 rows=42 loops=1)
                           Index Cond: (fkey_style = 303451)
                     ->  Hash  (cost=2.21..2.21 rows=121 width=18) (actual time=0.432..0.432 rows=121 loops=1)
                           ->  Seq Scan on opt_kind k  (cost=0.00..2.21 rows=121 width=18) (actual time=0.042..0.196 rows=121 loops=1)
               ->  Hash  (cost=8.80..8.80 rows=480 width=31) (actual time=1.687..1.687 rows=480 loops=1)
                     ->  Seq Scan on opt_header h  (cost=0.00..8.80 rows=480 width=31) (actual time=0.030..0.748 rows=480 loops=1)
 Total runtime: 20.893 ms
(15 rows)

What is going on, and how do I fix it? I'm using PostgreSQL 8.4.8. Here is what the actual view is doing:

 SELECT o.fkey_style, h.name AS header, k.name AS kind
      , o.code, o.name AS option_name, o.description
      , count(*) OVER (PARTITION BY h.name) AS header_count
      , count(*) OVER (PARTITION BY h.name, k.name) AS header_kind_count
   FROM chrome_nvd.options o
   JOIN chrome_nvd.opt_header h ON h.id = o.fkey_opt_header
   JOIN chrome_nvd.opt_kind k ON k.id = o.fkey_opt_kind
  ORDER BY o.sequence;

Comments (1)

瀟灑尐姊 2024-12-13 14:46:39

No, PostgreSQL will only push a WHERE clause down into a VIEW that does not contain an aggregate. (Window functions are considered aggregates.)

< x> I think that's just an implementation limitation

< EvanCarroll> x: I wonder what would have to be done to push the WHERE clause down in this case.

< EvanCarroll> the planner would have to know that the WindowAgg doesn't itself add selectivity and therefore it is safe to push the WHERE down?

< x> EvanCarroll; a lot of very complicated work with the planner, I'd presume

And,

< a> EvanCarroll: nope. a filter condition on a view applies to the output of the view and only gets pushed down if the view does not involve aggregates
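
To illustrate the point that a filter on a view applies to the view's output, here is a small sketch under stated assumptions: it runs on Python's bundled SQLite (assumed 3.25 or newer) rather than PostgreSQL, and the `options` table and `view_options` view are hypothetical, heavily reduced versions of the ones in the question. The window count inside the view is computed over every row first, and the outer WHERE only trims the result afterwards.

```python
import sqlite3

# Toy stand-in for the view in the question (hypothetical data, SQLite not PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (fkey_style INTEGER, header TEXT);
    INSERT INTO options VALUES (1, 'A'), (1, 'A'), (2, 'A'), (2, 'B');

    CREATE VIEW view_options AS
    SELECT fkey_style, header,
           count(*) OVER (PARTITION BY header) AS header_count
      FROM options;
""")

# The WHERE is applied to the OUTPUT of the view: the window count has
# already been computed across all fkey_style values.
rows = conn.execute("""
    SELECT * FROM view_options
     WHERE fkey_style = 1
     ORDER BY header
""").fetchall()

print(rows)  # two rows come back, each still reporting all 3 rows of header 'A'
```

In this toy case `header_count` stays at 3 although only two rows are returned. Evaluating the filter below the window function would yield 2 instead, so moving the predicate down would change what the view returns, not just how fast it runs. When per-style counts are what is actually wanted, one option is to write the query directly with the WHERE clause at the same level as the window functions instead of going through the view.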
