关于 SQL/Postgres 中聚合函数内部结构的问题
像 SUM 这样的函数是如何工作的?如果我执行
select id,sum(a) from mytable group by id
它是否按 id 排序,然后对每个相等 id 的范围求和?我不是规划专家,但看起来这就是正在发生的事情,其中 mytable 可能有一亿行,有几百万个不同的 ID。
或者它只是保留 id 的哈希值 -> current_sum,然后在每一行增加 id 的值或添加一个新键?这不是更快而且更少的内存消耗吗?
How does a function like SUM work? If I execute
select id,sum(a) from mytable group by id
does it sort by id and then sum over each range of equal id's? I am no planner expert, but it looks like that is what is happening, where mytable is maybe a hundred million rows with a few million distinct id's.
Or does it just keep a hash of id -> current_sum, and then at each row either increments the value of id or add a new key? Isn't that far faster and less memory hungry?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
SQL 标准试图规定外部行为,而不是内部行为。在这种特殊情况下,符合(众多)标准之一的 SQL 实现应该像按此顺序执行操作一样运行。
使用 FROM 子句中的所有表构造函数构建工作表。 (您的示例中只有一个。)
在 GROUP BY 子句中,将工作表分区为组。将每组减少为一行。将工作表替换为分组表。
解析 SELECT 子句中的表达式。
遵循 SQL 标准的查询优化器可以随意重新安排,只要结果与遵循这些步骤的结果相同即可。
您可以在 这个问题。
SQL standards try to dictate external behavior, not internal behavior. In this particular case, a SQL implementation that conforms to (one of the many) standards is supposed to act like it does things in this order.
Build a working table from all the table constructors in the FROM clause. (There's only one in your example.)
In the GROUP BY clause, partition the working table into groups. Reduce each group to one row. Replace the working table with the grouped table.
Resolve the expressions in the SELECT clause.
Query optimizers that follow SQL standards are free to rearrange things however they like, as long as the result is the same as if it had followed those steps.
You can find more details in the answers and comments to this SO question.
所以,我发现了这个, http://helmingstay.blogspot .com/2009/06/postgresql-poetry-aggregate-median-with.html,它声称它确实使用了累加器模式。嗯。
So, I found this, http://helmingstay.blogspot.com/2009/06/postgresql-poetry-aggregate-median-with.html, which claims that it does indeed use the accumulator pattern. Hmmm.