多分区Postgres表的高效查询
我刚刚重组了我的数据库以在 Postgres 8.2 中使用 分区 。现在我遇到了查询性能的问题:
SELECT *
FROM my_table
WHERE time_stamp >= '2010-02-10' and time_stamp < '2010-02-11'
ORDER BY id DESC
LIMIT 100;
表中有 4500 万行。在分区之前,这将使用反向索引扫描,并在达到限制时立即停止。
分区后(在 time_stamp 范围内),Postgres 对主表和相关分区进行完整索引扫描并合并结果,对它们进行排序,然后应用限制。这需要太长的时间。
我可以用以下方法修复它:
SELECT * FROM (
SELECT *
FROM my_table_part_a
WHERE time_stamp >= '2010-02-10' and time_stamp < '2010-02-11'
ORDER BY id DESC
LIMIT 100) t
UNION ALL
SELECT * FROM (
SELECT *
FROM my_table_part_b
WHERE time_stamp >= '2010-02-10' and time_stamp < '2010-02-11'
ORDER BY id DESC
LIMIT 100) t
UNION ALL
... and so on ...
ORDER BY id DESC
LIMIT 100
这运行得很快。时间戳超出范围的分区甚至不包含在查询计划中。
我的问题是:在 Postgres 8.2 中是否可以使用一些提示或语法来防止查询规划器扫描整个表,但仍然使用仅引用主表的简单语法?
基本上,我可以避免在当前定义的每个分区上动态构建大型 UNION 查询的痛苦吗?
编辑:我启用了constraint_exclusion(感谢@Vinko Vrsalovic)
I've just restructured my database to use partitioning in Postgres 8.2. Now I have a problem with query performance:
SELECT *
FROM my_table
WHERE time_stamp >= '2010-02-10' and time_stamp < '2010-02-11'
ORDER BY id DESC
LIMIT 100;
There are 45 million rows in the table. Prior to partitioning, this would use a reverse index scan and stop as soon as it hit the limit.
After partitioning (on time_stamp ranges), Postgres does a full index scan of the master table and the relevant partition and merges the results, sorts them, then applies the limit. This takes way too long.
I can fix it with:
SELECT * FROM (
SELECT *
FROM my_table_part_a
WHERE time_stamp >= '2010-02-10' and time_stamp < '2010-02-11'
ORDER BY id DESC
LIMIT 100) t
UNION ALL
SELECT * FROM (
SELECT *
FROM my_table_part_b
WHERE time_stamp >= '2010-02-10' and time_stamp < '2010-02-11'
ORDER BY id DESC
LIMIT 100) t
UNION ALL
... and so on ...
ORDER BY id DESC
LIMIT 100
This runs quickly. The partitions where the times-stamps are out-of-range aren't even included in the query plan.
My question is: Is there some hint or syntax I can use in Postgres 8.2 to prevent the query-planner from scanning the full table but still using simple syntax that only refers to the master table?
Basically, can I avoid the pain of dynamically building the big UNION query over each partition that happens to be currently defined?
EDIT: I have constraint_exclusion enabled (thanks @Vinko Vrsalovic)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您是否尝试过约束排除(您链接到的文档中的第 5.9.4 节)
Have you tried Constraint Exclusion (section 5.9.4 in the document you've linked to)
我遇到了类似的问题,可以通过在 WHERE 中转换条件来解决。
EG:(假设time_stamp列是timestamptz类型)
另外,确保表上的CHECK条件以相同的方式定义......
例如:
检查(时间戳<'2010-02-10'::timestamptz)
I had a similar problem that I was able fix by casting conditions in WHERE.
EG: (assuming the time_stamp column is timestamptz type)
Also, make sure the CHECK condition on the table is defined the same way...
EG:
CHECK (time_stamp < '2010-02-10'::timestamptz)
我遇到了同样的问题,在我的情况下,它归结为两个原因:
我索引了
timestamp with time zone
类型的列,并通过该列的timestamp Without time 类型进行了分区约束
。修复所有子表的约束后,需要
ANALYZE
。编辑:另一点知识 - 重要的是要记住约束排除(允许 PG 根据您的分区标准跳过扫描某些表)不起作用,引用:
non-immutable诸如 CURRENT_TIMESTAMP
之类的函数我收到了
CURRENT_DATE
的请求,这是我的问题的一部分。I had the same problem and it boiled down to two reasons in my case:
I had indexed column of type
timestamp WITH time zone
and partition constraint by this column with typetimestamp WITHOUT time zone
.After fixing constraints
ANALYZE
of all child tables was needed.Edit: another bit of knowledge - it's important to remember that constraint exclusion (which allows PG to skip scanning some tables based on your partitioning criteria) doesn't work with, quote:
non-immutable function such as CURRENT_TIMESTAMP
I had requests with
CURRENT_DATE
and it was part of my problem.