提高 PostgreSQL 总体性能
提高 PostgreSQL 中执行 MAX(id) 聚合的查询速度的最佳方法是什么?
我有一定数量的与 id 相关的记录,我可以在一秒钟内对其进行 COUNT() 操作,例如,
select count(id) as cnt from mytable where ref_id=2660
row cnt
1 2844
但是,当我尝试使用 MAX() 查找最新的记录 id 时,查询需要近 5 分钟。
select max(id) as id from mytable where ref_id=2660
这是令人惊讶的,因为我发现 PG 在处理更复杂的查询时速度惊人地快。为什么查询时间会存在如此大的差异,特别是对于如此少量的记录?提高这种性能的最佳方法是什么?
编辑:这是上述 MAX() 选择的查询计划:
"Result (cost=219.84..219.85 rows=1 width=0)"
" InitPlan 1 (returns $0)"
" -> Limit (cost=0.00..219.84 rows=1 width=4)"
" -> Index Scan Backward using mytable_pkey on mytable (cost=0.00..773828.42 rows=3520 width=4)"
" Filter: ((id IS NOT NULL) AND (ref_id = 2660))"
What's the best way to increase the speed of a query in PostgreSQL that's performing a MAX(id) aggregation?
I have a modest number of records associated with an id, which I can COUNT() in a second e.g.
select count(id) as cnt from mytable where ref_id=2660
row cnt
1 2844
However, when I try and find the most recent record id using MAX(), the query takes nearly 5 minutes.
select max(id) as id from mytable where ref_id=2660
This is surprising, because I've otherwise found PG surprisingly fast with much more complicated queries. Why would there be such a difference in the query times, especially for such a relatively small number of records? What would be the best way to improve this performance?
EDIT: This is the query plan for the above MAX() select:
"Result (cost=219.84..219.85 rows=1 width=0)"
" InitPlan 1 (returns $0)"
" -> Limit (cost=0.00..219.84 rows=1 width=4)"
" -> Index Scan Backward using mytable_pkey on mytable (cost=0.00..773828.42 rows=3520 width=4)"
" Filter: ((id IS NOT NULL) AND (ref_id = 2660))"
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我用谷歌搜索了一下,似乎 PostgreSQL(最高 8.4)不喜欢 MAX 和 MIN ,它对表进行顺序扫描以获得结果。如果没有查询计划和版本,很难说这是你的情况。
您可以尝试这个解决方法。
编辑:确保您有一个带有 (ref_id, id) 的索引,否则表扫描/排序是不可避免的。
I googled around, seems like PostgreSQL (up to 8.4) doesn't like
MAX
andMIN
, it does a sequential scan of the table to get the result. It's hard to say that it's your case without the query plan and the version.You can try this workaround.
Edit: Make sure you have an index with (ref_id, id), otherwise a table scan/sort is inevitable.
我正在使用 Postgres 8.4,并且可以说这可能是 Postgres 优化器中的一个错误,即不使用索引来进行涉及最小和最大聚合函数的查询。
更改我的查询后
从表中选择最大值(字段)到
按字段限制 1 从表顺序中选择字段
我的查询执行时间从 10 秒缩短到不到一秒。
当然,您可以为相关列定义索引,否则 postgres 将执行 seq_scan。
I am using Postgres 8.4 and can say it may be a bug in Postgres optimizer to not using indexes for queries envolving min and max agregation functions.
After changing my queries from
Select max(field) from table to
Select field from table order by field limit 1
my query execution time improved from 10s to less than a second.
Of course You might define and index for the column in question, otherwise postgres will do a seq_scan.