创建当前行日期的上个月滚动总计的列
上下文
使用Presto语法,我正在尝试创建一个输出表,该表的总计为一个月内每天的“金额”列值。在每一行中,还将有一个列,上个月滚动总数,还有一个列,总计之间有差异。
输出要求
- 完成:创建
month_to_date_amount
列存储从金额的总和
列。滚动总数的范围是第一个月和当前行日期
列值。重新启动滚动 每个月总数。
我已经在下面有一个工作查询,可以创建此列。
SELECT
*,
SUM(amount) OVER (
PARTITION BY
team,
month_id
ORDER BY
date ASC
) month_to_date_amount
FROM (
SELECT -- this subquery is required to handle duplicate dates
date,
SUM(amount) AS amount,
team,
month_id
FROM input_table
GROUP BY
date,
team,
month_id
) AS t
创建
Prev_month_to_date_amount
列:a。存储当前行的前几个月滚动金额
日期
和team
,并添加到同一 输出行。b。返回
0
如果没有记录匹配上个月日期。 (EX。3月31日的预级月日期为2月31日,因此不存在)。同样,没有金额
值的天数也将不存在。示例输出表如下。创建
移动
存储差异的列month_to_date_amount
列之间的金额Prev_month_to_date_amount
当前行的列。
问题
有人可以协助我上面的第二和第三条要求以达到下面所示的我所需的输出?通过添加到我当前的查询中,或者在必要时创建另一个更有效的查询。具有多个查询的解决方案很好。
输入表
团队 | 日期 | 量 | 月_id |
---|---|---|---|
A | 2022-04-01 | 1 | 2022-04 |
A | 2022-04-01 | 1 | 2022-04 |
A | 2022-04-02 | 1 | 2022-04 |
B | 2022-04 B 2022-04-01 | 3 | 2022-04 B 2022-04 |
B 2022-04 B | 2022-04 -02 | 3 | 2022-04 |
B | 2022-05-01 | 4 | 2022-05 |
B | 2022-05-02 | 4 | 2022-05 |
C | 2022-05-01 | 1 | 2022-05 |
C | 2022-05 C 2022-05-02-02 | 1 | 2022-05 |
C 2022-05 C | 2022-06 -01 | 5 | 2022-06 |
C | 2022-06-02 | 5 | 2022-06 |
Context
Using Presto syntax, I'm trying to create an output table that has rolling totals of an 'amount' column value for each day in a month. In each row there will also be a column with a rolling total for the previous month, and also a column with the difference between the totals.
Output Requirements
- completed: create
month_to_date_amount
column that stores rolling total from
sum ofamount
column. The range for the rolling total is between 1st of month and current rowdate
column value. Restart rolling
total each month.
I already have a working query below that creates this column.
SELECT
*,
SUM(amount) OVER (
PARTITION BY
team,
month_id
ORDER BY
date ASC
) month_to_date_amount
FROM (
SELECT -- this subquery is required to handle duplicate dates
date,
SUM(amount) AS amount,
team,
month_id
FROM input_table
GROUP BY
date,
team,
month_id
) AS t
create
prev_month_to_date_amount
column that:a. stores previous months rolling amount for the current rows
date
andteam
and add to same
output row.b. Return
0
if there is no record matching the previous month date. (Ex. Prev months date for March 31 is Feb 31 so does not exist). Also a record will not exist for days that have noamount
values. Example output table is below.create
movement
column that stores the difference
amount betweenmonth_to_date_amount
column andprev_month_to_date_amount
column from current row.
Question
Could someone assist with my 2nd and 3rd requirements above to achieve my desired output shown below? By either adding on to my current query above, or creating another more efficient one if necessary. A solution with multiple queries is fine.
Input Table
team | date | amount | month_id |
---|---|---|---|
A | 2022-04-01 | 1 | 2022-04 |
A | 2022-04-01 | 1 | 2022-04 |
A | 2022-04-02 | 1 | 2022-04 |
B | 2022-04-01 | 3 | 2022-04 |
B | 2022-04-02 | 3 | 2022-04 |
B | 2022-05-01 | 4 | 2022-05 |
B | 2022-05-02 | 4 | 2022-05 |
C | 2022-05-01 | 1 | 2022-05 |
C | 2022-05-02 | 1 | 2022-05 |
C | 2022-06-01 | 5 | 2022-06 |
C | 2022-06-02 | 5 | 2022-06 |
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
此答案是使用窗口函数滞后的一个很好的例子。总而言之,查询按团队和每月的数据分区分区,并使用滞后来获取前几个月的数量并计算运动价值。
例如,B数据团队数据。窗口函数将创建两个分区集:一个与团队B 01/04/2022和01/05/2022行,另一个与团队B 02/04/2022和02/05/2022行,订购每个分区集按日期。然后,对于每一行的每组,请使用滞后从上一行获取数据(如果存在)以启用运动并检索前几个月的数量。
我希望这会有所帮助。
This answer is a good example of using the window function LAG. In summary the query partitions the data by Team and Day of Month, and uses LAG to get the previous months amount and calculate the movement value.
e.g. for Team B data. The window function will create two partition sets: one with the Team B 01/04/2022 and 01/05/2022 rows, and one with the Team B 02/04/2022 and 02/05/2022 rows, order each partition set by date. Then for each set for each row, use LAG to get the data from the previous row (if one exists) to enable calculation of the movement and retrieve the previous months amount.
I hope this helps.