SQL:使用窗口功能每天应用每天的汇总结果
考虑一个包含三个字段的时间序列表time
type timestamptz ,balance
type numeric 和代码> IS_SPENT_COLUMN type text 的类型。
以下查询在给定间隔的最后一天生成了有效的结果。
SELECT
MAX(DATE_TRUNC('DAY', (time))) as last_day,
SUM(balance) FILTER ( WHERE is_spent_column is NULL ) AS value_at_last_day
FROM tbl
2010-07-12 18681.80077501749874141407984000
但是,我需要基于窗口功能的等效查询,以报告所有天数的列的总价值balance
包括给定日期。
这是我到目前为止尝试的,但是没有任何有效的结果:
SELECT
DATE_TRUNC('DAY', (time)) AS daily,
SUM(sum(balance) FILTER ( WHERE is_spent_column is NULL ) ) OVER ( ORDER BY DATE_TRUNC('DAY', (time)) ) AS total_value_per_day
FROM tbl
group by 1
order by 1 desc
2010-07-12 16050.496390449777568391974000
2010-07-11 13103.159119670350269890284000
2010-07-10 12594.525752964512456914454000
2010-07-09 12380.159588711091681327014000
2010-07-08 12178.11954253666668113577014000
2010-07-07 11995.943973804127033140014000
EDIT:
Here is a sample dataset: 链接删除了
运行总计可以通过在整个数据集中应用上面的第一个查询到所需的一天来计算。例如,对于Day 2009-01-31
,结果为97.13522530000000000000,或者对于Day 2009-01-15
,当我们将时间过滤为time&lt< '2009-01-16 00:00:00'
它返回24.44614400000000000000
。
我需要的是一个替代查询,该查询每天在一个查询中计算每天的运行总数。
编辑2:
非常感谢大家的参与和支持。
查询结果集差异的原因是在前面的ETL管道上。对不起,我的无知!
在下面,我提供了一个示例模式来测试查询。
现在,上面给出的两个查询和下面答案中给出的查询返回相同的结果。
Consider a time-series table that contains three fields time
of type timestamptz, balance
of type numeric, and is_spent_column
of type text.
The following query generates a valid result for the last day of the given interval.
SELECT
MAX(DATE_TRUNC('DAY', (time))) as last_day,
SUM(balance) FILTER ( WHERE is_spent_column is NULL ) AS value_at_last_day
FROM tbl
2010-07-12 18681.800775017498741407984000
However, I am in need of an equivalent query based on window functions to report the total value of the column named balance
for all the days up to and including the given date .
Here is what I've tried so far, but without any valid result:
SELECT
DATE_TRUNC('DAY', (time)) AS daily,
SUM(sum(balance) FILTER ( WHERE is_spent_column is NULL ) ) OVER ( ORDER BY DATE_TRUNC('DAY', (time)) ) AS total_value_per_day
FROM tbl
group by 1
order by 1 desc
2010-07-12 16050.496339044977568391974000
2010-07-11 13103.159119670350269890284000
2010-07-10 12594.525752964512456914454000
2010-07-09 12380.159588711091681327014000
2010-07-08 12178.119542536668113577014000
2010-07-07 11995.943973804127033140014000
EDIT:
Here is a sample dataset:
LINK REMOVED
The running total can be computed by applying the first query above on the entire dataset up to and including the desired day. For example, for day 2009-01-31
, the result is 97.13522530000000000000, or for day 2009-01-15
when we filter time as time < '2009-01-16 00:00:00'
it returns 24.446144000000000000
.
What I need is an alternative query that computes the running total for each day in a single query.
EDIT 2:
Thank you all so very much for your participation and support.
The reason for differences in result sets of the queries was on the preceding ETL pipelines. Sorry for my ignorance!
Below I've provided a sample schema to test the queries.
https://www.db-fiddle.com/f/veUiRauLs23s3WUfXQu3WE/2
Now both queries given above and the query given in the answer below return the same result.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
考虑通过窗口函数 将数据汇总到日级,请考虑计算总数。而且,由于您使用单个条件进行汇总,因此
filter
可以将基本转换为:Consider calculating running total via window function after aggregating data to day level. And since you aggregate with a single condition,
FILTER
condition can be converted to basicWHERE
: