Why are bad row estimates slow in Postgres?
What makes bad row estimates a pain point in SQL query performance? I’m interested to know the internal reasons why.
Often a bad row estimate will actually pick the correct plan, and the only difference between a good query and a bad query will be the estimated row counts.
Why is there frequently such a massive performance difference?
Is it because Postgres uses row estimates to allocate memory?
The row count estimates influence further decisions by the optimizer, so a bad estimate can lead to a bad plan.
In my experience, the problem typically shows up when choosing the join strategy:
When row counts are underestimated, PostgreSQL may pick a nested loop join instead of a hash or a merge join, but the inner table ends up being scanned more often than PostgreSQL expected, leading to bad performance.
Conversely, if PostgreSQL overestimates the row count, it may pick a hash or merge join and scan both tables fully, which can be much slower than a couple of index scans on the inner table.
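To make that concrete, here is a small, hedged sketch (the table and column names are invented for illustration): two correlated columns make the planner underestimate a combined filter, and an underestimate like this feeding into a join is exactly what can tip the planner toward a nested loop.

```sql
-- Illustration only: schema and names are made up, not taken from the question.
CREATE TABLE facts (a int, b int);
-- b is fully determined by a, so the two predicates below are correlated.
INSERT INTO facts SELECT g % 100, g % 100 FROM generate_series(1, 1000000) g;
ANALYZE facts;

-- Roughly 10,000 rows actually match, but the planner multiplies the two
-- selectivities as if they were independent and estimates about 100 rows.
EXPLAIN SELECT * FROM facts WHERE a = 1 AND b = 1;

-- Extended statistics describe the correlation and repair the estimate.
CREATE STATISTICS facts_a_b (dependencies) ON a, b FROM facts;
ANALYZE facts;
EXPLAIN SELECT * FROM facts WHERE a = 1 AND b = 1;
```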
The PostgreSQL optimizer is a cost-based optimizer (CBO): a query is executed with the execution plan that has the smallest estimated cost, and that cost is calculated from the table statistics. Wrong statistics can therefore lead to a bad execution plan. Here is an example.

There are two tables: T1 has 20000000 rows and T2 has 1000000 rows. When we join the two tables, we get an execution plan that might use a Merge Join. But when I update a lot of rows, adding 100000000 to id for every row where id is smaller than 1000000, and run the same query again, it still uses a Merge Join, even though there should now be a better option than a Merge Join. That is because the update did not hit the autovacuum analyze threshold (by default autovacuum_analyze_scale_factor is 0.1, meaning more than 10% of a table's rows have to change before PostgreSQL updates its statistics automatically). When we run a manual ANALYZE T1;, which means updating the statistics of table T1, and then run the query again, it gets a Nested Loop, which is better than the Merge Join.
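A minimal sketch of that walkthrough, assuming a trivial two-column schema for T1 and T2 (the answer does not show the actual table definitions or the UPDATE statement), could look like this:

```sql
-- Sketch of the scenario above; schema details are assumed.
CREATE TABLE t1 (id int PRIMARY KEY, payload text);
CREATE TABLE t2 (id int PRIMARY KEY, payload text);
INSERT INTO t1 SELECT g, 'x' FROM generate_series(1, 20000000) g;
INSERT INTO t2 SELECT g, 'x' FROM generate_series(1, 1000000) g;
ANALYZE t1;
ANALYZE t2;

EXPLAIN SELECT * FROM t1 JOIN t2 USING (id);  -- typically a Merge or Hash Join here

-- Move the ids that used to match t2; this touches only 5% of t1,
-- so autovacuum does not re-analyze the table on its own.
UPDATE t1 SET id = id + 100000000 WHERE id < 1000000;

EXPLAIN SELECT * FROM t1 JOIN t2 USING (id);  -- still planned from the stale statistics

ANALYZE t1;                                   -- refresh T1's statistics manually
EXPLAIN SELECT * FROM t1 JOIN t2 USING (id);  -- the plan can now change (the answer reports a Nested Loop)
```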
Small conclusion:
Accurate statistics on a table help the optimizer compute accurate costs and therefore pick the right execution plan.
Here is a script that helps us find when last_analyze and last_vacuum last ran.
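A minimal version of such a script, based on the pg_stat_user_tables statistics view, is sketched below; the last_* columns record the most recent manual and autovacuum runs for each table.

```sql
-- When was each table last vacuumed and analyzed, manually and by autovacuum?
SELECT relname,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY relname;
```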
The row count estimates are used to calculate the cost of different plans.
When those estimates are way off, so are the resulting plan costs, which means PostgreSQL ends up using the wrong plan. For instance, it may scan a whole table because it thinks it needs a significant part of it, when in reality only a few rows are needed and they could have been retrieved much faster using an index.
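As a hedged illustration of that last point (the table and values here are invented): whether the planner chooses a sequential scan or an index scan depends directly on how many rows it expects the filter to return.

```sql
-- Illustration only: the table and values are made up.
CREATE TABLE events (id serial PRIMARY KEY, kind text);
CREATE INDEX ON events (kind);
INSERT INTO events (kind)
SELECT CASE WHEN g % 1000 = 0 THEN 'rare' ELSE 'common' END
FROM generate_series(1, 1000000) g;
ANALYZE events;

EXPLAIN SELECT * FROM events WHERE kind = 'rare';    -- few rows expected: index scan is cheap
EXPLAIN SELECT * FROM events WHERE kind = 'common';  -- most of the table expected: sequential scan wins
```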