Postgres 按性能分区顺序

发布于 2024-11-14 13:58:34 字数 2148 浏览 5 评论 0原文

我正在使用分区的 postgres 表，遵循文档使用规则，使用基于日期范围的分区方案（我的日期列是纪元整数）

问题是，选择具有分片列最大值的行的简单查询没有使用索引：

首先，一些设置强制 postgres做我想做的事：设置约束排除=开；设置enable_seqscan = 关闭；

对单个分区的查询有效：

explain (SELECT * FROM urls_0 ORDER BY date_created ASC  LIMIT 1);
Limit  (cost=0.00..0.05 rows=1 width=38)
  ->  Index Scan using urls_date_created_idx_0 on urls_0  (cost=0.00..436.68 rows=8099 width=38)

但是，对整个表的相同查询是 seq 扫描：

explain (SELECT * FROM urls ORDER BY date_created ASC  LIMIT 1);
Limit  (cost=50000000274.88..50000000274.89 rows=1 width=51)
   ->  Sort  (cost=50000000274.88..50000000302.03 rows=10859 width=51)
         Sort Key: public.urls.date_created
         ->  Result  (cost=10000000000.00..50000000220.59 rows=10859 width=51)
               ->  Append  (cost=10000000000.00..50000000220.59 rows=10859 width=51)
                     ->  Seq Scan on urls  (cost=10000000000.00..10000000016.90 rows=690 width=88)
                     ->  Seq Scan on urls_15133 urls  (cost=10000000000.00..10000000016.90 rows=690 width=88)
                     ->  Seq Scan on urls_15132 urls  (cost=10000000000.00..10000000016.90 rows=690 width=88)
                     ->  Seq Scan on urls_15131 urls  (cost=10000000000.00..10000000016.90 rows=690 width=88)
                     ->  Seq Scan on urls_0 urls  (cost=10000000000.00..10000000152.99 rows=8099 width=38)

最后，通过 date_created 进行的查找确实可以在约束排除和索引扫描的情况下正常工作：

explain (SELECT * FROM urls where date_created = 1212)
Result  (cost=10000000000.00..10000000052.75 rows=23 width=45)
   ->  Append  (cost=10000000000.00..10000000052.75 rows=23 width=45)
         ->  Seq Scan on urls  (cost=10000000000.00..10000000018.62 rows=3 width=88)
               Filter: (date_created = 1212)
         ->  Index Scan using urls_date_created_idx_0 on urls_0 urls  (cost=0.00..34.12 rows=20 width=38)
               Index Cond: (date_created = 1212)

有谁知道如何使用分区以便这种类型的查询会使用索引扫描吗？

原文

I'm using a partitioned postgres table following the documentation using rules, using a partitioning scheme based on date ranges (my date column is an epoch integer)

The problem is that a simple query to select the row with the maximum value of the sharded column is not using indices:

First, some settings to coerce postgres to do what I want:
SET constraint_exclusion = on;
SET enable_seqscan = off;

The query on a single partition works:

explain (SELECT * FROM urls_0 ORDER BY date_created ASC  LIMIT 1);
Limit  (cost=0.00..0.05 rows=1 width=38)
  ->  Index Scan using urls_date_created_idx_0 on urls_0  (cost=0.00..436.68 rows=8099 width=38)

However, the same query on the entire table is seq scanning:

explain (SELECT * FROM urls ORDER BY date_created ASC  LIMIT 1);
Limit  (cost=50000000274.88..50000000274.89 rows=1 width=51)
   ->  Sort  (cost=50000000274.88..50000000302.03 rows=10859 width=51)
         Sort Key: public.urls.date_created
         ->  Result  (cost=10000000000.00..50000000220.59 rows=10859 width=51)
               ->  Append  (cost=10000000000.00..50000000220.59 rows=10859 width=51)
                     ->  Seq Scan on urls  (cost=10000000000.00..10000000016.90 rows=690 width=88)
                     ->  Seq Scan on urls_15133 urls  (cost=10000000000.00..10000000016.90 rows=690 width=88)
                     ->  Seq Scan on urls_15132 urls  (cost=10000000000.00..10000000016.90 rows=690 width=88)
                     ->  Seq Scan on urls_15131 urls  (cost=10000000000.00..10000000016.90 rows=690 width=88)
                     ->  Seq Scan on urls_0 urls  (cost=10000000000.00..10000000152.99 rows=8099 width=38)

Finally, a lookup by date_created does work correctly with contraint exclusions and index scans:

explain (SELECT * FROM urls where date_created = 1212)
Result  (cost=10000000000.00..10000000052.75 rows=23 width=45)
   ->  Append  (cost=10000000000.00..10000000052.75 rows=23 width=45)
         ->  Seq Scan on urls  (cost=10000000000.00..10000000018.62 rows=3 width=88)
               Filter: (date_created = 1212)
         ->  Index Scan using urls_date_created_idx_0 on urls_0 urls  (cost=0.00..34.12 rows=20 width=38)
               Index Cond: (date_created = 1212)

Does anyone know how to use partitioning so that this type of query will use an index scan?

分享到QQ

分享到微博