如何改善GostGreSQL相交速度？

发布于 2025-01-20 07:57:11 字数 2409 浏览 0 评论 0原文

我在 postgres 中创建了下表：

CREATE TABLE mytable (id serial, c1 int, c2 int)

以下查询需要 0.9 毫秒才能完成：

select id from mytable where c1=555 and c2=444;

以下查询需要 6.5 毫秒才能完成：

select id from mytable where c1=555 intersect select id from mytable where c2=444;

第一个查询的查询计划是：

 Bitmap Heap Scan on mytable  (cost=46.90..54.84 rows=2 width=4) (actual time=0.623..0.626 rows=2 loops=1)
   Recheck Cond: ((c2 = 444) AND (c1 = 555))
   Heap Blocks: exact=2
   ->  BitmapAnd  (cost=46.90..46.90 rows=2 width=0) (actual time=0.609..0.610 rows=0 loops=1)
         ->  Bitmap Index Scan on i2  (cost=0.00..23.31 rows=1985 width=0) (actual time=0.267..0.267 rows=1978 loops=1)
               Index Cond: (c2 = 444)
         ->  Bitmap Index Scan on i1  (cost=0.00..23.33 rows=1987 width=0) (actual time=0.258..0.258 rows=1988 loops=1)
               Index Cond: (c1 = 555)

第二个计划是：

HashSetOp Intersect  (cost=23.81..10244.16 rows=1985 width=8) (actual time=6.784..6.804 rows=2 loops=1)
   ->  Append  (cost=23.81..10234.23 rows=3972 width=8) (actual time=0.543..6.093 rows=3966 loops=1)
         ->  Subquery Scan on "*SELECT* 2"  (cost=23.81..5106.08 rows=1985 width=8) (actual time=0.542..2.928 rows=1978 loops=1)
               ->  Bitmap Heap Scan on mytable  (cost=23.81..5086.23 rows=1985 width=4) (actual time=0.540..2.636 rows=1978 loops=1)
                     Recheck Cond: (c2 = 444)
                     Heap Blocks: exact=1810
                     ->  Bitmap Index Scan on i2  (cost=0.00..23.31 rows=1985 width=0) (actual time=0.331..0.331 rows=1978 loops=1)
                           Index Cond: (c2 = 444)
         ->  Subquery Scan on "*SELECT* 1"  (cost=23.83..5108.29 rows=1987 width=8) (actual time=0.537..2.790 rows=1988 loops=1)
               ->  Bitmap Heap Scan on mytable mytable_1  (cost=23.83..5088.42 rows=1987 width=4) (actual time=0.536..2.495 rows=1988 loops=1)
                     Recheck Cond: (c1 = 555)
                     Heap Blocks: exact=1812
                     ->  Bitmap Index Scan on i1  (cost=0.00..23.33 rows=1987 width=0) (actual time=0.340..0.340 rows=1988 loops=1)
                           Index Cond: (c1 = 555)

速度差异似乎是由于在第一次查询执行中使用了 BitmapAnd 。有没有办法让 postgres 对第二个查询也执行类似的执行？

原文

I have created the following table in postgres:

CREATE TABLE mytable (id serial, c1 int, c2 int)

The following query takes 0.9 ms to complete:

select id from mytable where c1=555 and c2=444;

The following query takes 6.5 ms to complete:

select id from mytable where c1=555 intersect select id from mytable where c2=444;

The query plan for the first query is:

 Bitmap Heap Scan on mytable  (cost=46.90..54.84 rows=2 width=4) (actual time=0.623..0.626 rows=2 loops=1)
   Recheck Cond: ((c2 = 444) AND (c1 = 555))
   Heap Blocks: exact=2
   ->  BitmapAnd  (cost=46.90..46.90 rows=2 width=0) (actual time=0.609..0.610 rows=0 loops=1)
         ->  Bitmap Index Scan on i2  (cost=0.00..23.31 rows=1985 width=0) (actual time=0.267..0.267 rows=1978 loops=1)
               Index Cond: (c2 = 444)
         ->  Bitmap Index Scan on i1  (cost=0.00..23.33 rows=1987 width=0) (actual time=0.258..0.258 rows=1988 loops=1)
               Index Cond: (c1 = 555)

and for the second plan is:

HashSetOp Intersect  (cost=23.81..10244.16 rows=1985 width=8) (actual time=6.784..6.804 rows=2 loops=1)
   ->  Append  (cost=23.81..10234.23 rows=3972 width=8) (actual time=0.543..6.093 rows=3966 loops=1)
         ->  Subquery Scan on "*SELECT* 2"  (cost=23.81..5106.08 rows=1985 width=8) (actual time=0.542..2.928 rows=1978 loops=1)
               ->  Bitmap Heap Scan on mytable  (cost=23.81..5086.23 rows=1985 width=4) (actual time=0.540..2.636 rows=1978 loops=1)
                     Recheck Cond: (c2 = 444)
                     Heap Blocks: exact=1810
                     ->  Bitmap Index Scan on i2  (cost=0.00..23.31 rows=1985 width=0) (actual time=0.331..0.331 rows=1978 loops=1)
                           Index Cond: (c2 = 444)
         ->  Subquery Scan on "*SELECT* 1"  (cost=23.83..5108.29 rows=1987 width=8) (actual time=0.537..2.790 rows=1988 loops=1)
               ->  Bitmap Heap Scan on mytable mytable_1  (cost=23.83..5088.42 rows=1987 width=4) (actual time=0.536..2.495 rows=1988 loops=1)
                     Recheck Cond: (c1 = 555)
                     Heap Blocks: exact=1812
                     ->  Bitmap Index Scan on i1  (cost=0.00..23.33 rows=1987 width=0) (actual time=0.340..0.340 rows=1988 loops=1)
                           Index Cond: (c1 = 555)

The difference in speed seems to be due to BitmapAnd being used in the first query execution. Is there anyway to make postgres do a similar execution for the second query as well ?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

要走干脆点 2025-01-27 07:57:11

进入共享缓冲区中已取消固定的页面非常慢。当然不是与从磁盘读取该页面相比，而是与您所做的任何其他事情相比。需要多个原子操作来确保并发进程在执行此操作时不会相互破坏，并且几乎肯定还会涉及 CPU 缓存未命中。您的 HashSetOp 需要比 BitmapAnd 执行更多操作（大约 1800 倍，根据您的查询计划编号，但这只是计算表页）。 BitmapAnd 与从本地内存中的索引读取的 ctids 相交，然后仅命中那些在相交中幸存的表，而 HashSetOp 需要先命中表才能进行相交，因为它是与 SQL 字段相交，而不是 ctids。

您可以通过使用仅索引扫描来改进错误查询。如果查询根本不需要访问表，因为它可以在索引中找到 SQL 字段，那么它需要访问的页面要少得多（索引页面的每个页面中塞满了许多相关的元组）

如果我在(c1, id) 和 (c2, id)，所以我让每个子查询都使用仅索引扫描，生成的 HashSetOp 计划比原来的快 3 到 4 倍原始HashSetOp没有这些索引的计划，同时仍然
比 BitmapAnd 慢 2 到 3 倍。这是在一个刚刚 VACUUMed 的表上，因此所有页面都标记为全部可见。