Applying multiple window functions on the same partition
Is it possible to apply multiple window functions to the same partition? (Correct me if I'm not using the right vocabulary)
For example, you can do:
SELECT name, first_value() over (partition by name order by date) from table1
But is there a way to do something like:
SELECT name, (first_value() as f, last_value() as l (partition by name order by date)) from table1
Where we are applying two functions onto the same window?
Reference: http://postgresql.ro/docs/8.4/static/tutorial-window.html
Can you not just repeat the window for each selected column?
Something like:
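A minimal sketch of that first form, run through Python's built-in sqlite3 module so it works without a PostgreSQL server (this assumes an SQLite build of 3.25 or newer, which is when window functions were added). The value column and the sample rows are hypothetical, added only so that first_value() and last_value() have an argument; table1, name and date come from the question.

```python
import sqlite3

# Hypothetical sample data: table1, name and date come from the question,
# the value column is an assumption so the window functions have an argument.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name TEXT, date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("a", "2020-01-01", 1), ("a", "2020-01-02", 2), ("b", "2020-01-01", 5)],
)

# Each window function carries its own OVER clause; the two clauses are identical.
rows = conn.execute(
    """
    SELECT name,
           first_value(value) OVER (PARTITION BY name ORDER BY date) AS f,
           last_value(value)  OVER (PARTITION BY name ORDER BY date) AS l
    FROM table1
    ORDER BY name, date
    """
).fetchall()
print(rows)
```

Note that with this window last_value() returns the current row's value, because the default frame ends at the current row (and its peers).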
Also, from your reference, you can do it like this:
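A sketch of the named-window form from the referenced tutorial, again on Python's sqlite3 module (SQLite 3.25+ assumed) with a hypothetical value column and made-up rows. The window name w is an arbitrary choice.

```python
import sqlite3

# Hypothetical sample data: the value column and the rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name TEXT, date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("a", "2020-01-01", 1), ("a", "2020-01-02", 2), ("b", "2020-01-01", 5)],
)

# The window is defined once in the WINDOW clause and referenced by name.
rows = conn.execute(
    """
    SELECT name,
           first_value(value) OVER w AS f,
           last_value(value)  OVER w AS l
    FROM table1
    WINDOW w AS (PARTITION BY name ORDER BY date)
    ORDER BY name, date
    """
).fetchall()
print(rows)
```

Both functions are guaranteed to see the same partition and ordering, since there is only one definition to keep up to date.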
Warning: I am not deleting this answer, since it seems technically correct and may therefore be helpful, but beware that PARTITION BY bar ORDER BY foo is probably not what you want anyway. Aggregate functions over such a window do not compute over the partition as a whole: with ORDER BY, the default window frame only extends from the start of the partition to the current row, so the aggregate becomes a running value. That is, SELECT avg(foo) OVER (PARTITION BY bar ORDER BY foo) is not equivalent to SELECT avg(foo) OVER (PARTITION BY bar) (see the proof at the end of the answer).

Though it doesn't improve performance per se, if you use the same partition multiple times, you probably want to use the second syntax proposed by astander, and not only because it is shorter to write. Here is why.
Consider the following query:
Since in principle the ordering has no effect on the computation of the average, you might be tempted to use the following query instead (no ordering on the second partition):
This is a big mistake, as it will take much longer. Proof:
Now, if you are aware of this issue, you will of course use the same partition everywhere. But when the same partition appears ten or more times and you keep editing the query over several days, it is quite easy to forget the ORDER BY clause on a partition that doesn't need it by itself.

Here comes the WINDOW syntax, which will protect you from such careless mistakes (provided, of course, you are aware that it is better to minimize the number of distinct window definitions). The following is strictly equivalent (as far as I can tell from EXPLAIN ANALYZE) to the first query:

Post-warning update:

I understand that the statement "SELECT avg(foo) OVER (PARTITION BY bar ORDER BY foo) is not equivalent to SELECT avg(foo) OVER (PARTITION BY bar)" seems questionable, so here is an example:
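A small sketch of the difference, using Python's built-in sqlite3 module (SQLite 3.25+ assumed) rather than PostgreSQL; both use the same default frame when ORDER BY is present. The table name foobar and the three sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical table and data: one partition (bar = 'x') with foo = 1, 2, 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foobar (foo INTEGER, bar TEXT)")
conn.executemany(
    "INSERT INTO foobar VALUES (?, ?)", [(1, "x"), (2, "x"), (3, "x")]
)

# With ORDER BY the frame stops at the current row, giving a running average;
# without it the frame is the whole partition.
rows = conn.execute(
    """
    SELECT foo,
           avg(foo) OVER (PARTITION BY bar ORDER BY foo) AS running_avg,
           avg(foo) OVER (PARTITION BY bar)              AS partition_avg
    FROM foobar
    ORDER BY foo
    """
).fetchall()
for row in rows:
    print(row)
```

Only the last row of the partition has the same value in both columns; on every earlier row the ordered window has averaged fewer values.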