&quot年同比使用窗口功能

发布于 2025-01-19 11:24:19 字数 2881 浏览 2 评论 0原文

我正在按年进行Year Year进行加入分析。我每年都会加入同一张桌子,但是由于我使用了另一个工具来构建我的SQL,它的“动态”不是那么“动态”。如果我可以通过窗口函数解决此问题,那就更好了。因此,任何建议都得到赞赏:D的

想法是按小时进行此操作。也就是说,我想比较销售,从2022-04-05小时8与2021-04-05小时8和2020-04-05小时8的销售额进行比较。

我的数据按小时汇总:

商店TimestampSales
12019 -04-05T08:00:00Z10000
12020-04-05T08:00:00Z12000
12021-04-05T08:00:00Z15000
12022-04-05T08:00:00Z20000
22019-04-05T08:00 :00z13000
22020-04-05T08:00:00 Z16000
22021-04-05T08:00:00:00 Z19000
22022-04-04-05T08:00:00 Z22000

值得效果。我添加了它们只是为了澄清:

存储TIMESTAMP_1SALES_1TIMESTAMP_2SALES_2TIMESTAMP_3SALES_3
12019-04-05T08:00:00Z100002020-04-04-05T08:00:00Z120002021-021-04-04-04-05T08:00:00 Z15000
2 2019 22019-2 201999 000 2204--04--04--04--04 - 05T08:00:00Z130002020-04-05T08:00:00Z160002021-04-05T08:00:00Z19000

有什么想法吗? 提前致谢

I'm doing year on year analysis by doing joins. I'm joining the same table for every year but since I'm using another tool to build my SQL its not so 'dynamic'. If I could solve this with window functions it would be better. So any suggestion is appreciated :D

The idea is to do this by hour. That is, I want to compare sales lets say from 2022-04-05 hour 8 with sales from 2021-04-05 hour 8 and from 2020-04-05 hour 8.

My data is aggregated by hour:

StoreTimestampSales
12019-04-05T08:00:00Z10000
12020-04-05T08:00:00Z12000
12021-04-05T08:00:00Z15000
12022-04-05T08:00:00Z20000
22019-04-05T08:00:00Z13000
22020-04-05T08:00:00Z16000
22021-04-05T08:00:00Z19000
22022-04-05T08:00:00Z22000

Desired result (order may begin with this year) timestamps are not needed. I added them just to clarify:

StoreTimestamp_1Sales_1Timestamp_2Sales_2Timestamp_3Sales_3
12019-04-05T08:00:00Z100002020-04-05T08:00:00Z120002021-04-05T08:00:00Z15000
22019-04-05T08:00:00Z130002020-04-05T08:00:00Z160002021-04-05T08:00:00Z19000

Any ideas?
Thanks in Advance

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

滥情哥ㄟ 2025-01-26 11:24:19

不是真正的答案,但如果您只有一天的时间,

SELECT 
    store
    ,hour(date)
    ,array_agg(object_construct(date::text, sales)) within group (order by date) as hour_history
FROM data_table
GROUP BY 1,2
ORDER BY 1,2;

则给出:

STOREHOUR(DATE)HOUR_HISTORY
18[ { "2019-04-05 08:00:00.000": 10000 }, { "2020-04-05 08 :00:00.000": 12000 }, { "2021-04-05 08:00:00.000": 15000 }, { "2022-04-05 08:00:00.000": 20000 } ]
28[ { "2019-04-05 08:00:00.000": 13000 }, { "2020-04-05 08:00:00.000": 16000 }, { "2021-04-05 08:00:00.000": 19000 }, { "2022-04-05 08:00:00.000": 22000 } ]

因此:

SELECT store
    ,hour_history[0].date::timestamp as Timestamp_1
    ,hour_history[0].sales::number as Sales_1
    ,hour_history[1].date::timestamp as Timestamp_2
    ,hour_history[1].sales::number as Sales_2
    ,hour_history[2].date::timestamp as Timestamp_3
    ,hour_history[2].sales::number as Sales_3
FROM (
SELECT 
    store
    ,hour(date)
    ,array_agg(object_construct('date', date::text, 'sales', sales)) within group (order by date) as hour_history
FROM data_table
GROUP BY 1,2
)
ORDER BY 1;

确实给出:

STORETIMESTAMP_1SALES_1TIMESTAMP_2SALES_2TIMESTAMP_3SALES_3
12019-04-05 08:00:00.00010,0002020-04-05 08:00:00.00012,0002021-04-05 08:00:00.00015,000
22019-04-05 08:00:00.00013,0002020-04-05 08:00:00.00016,0002021-04-05 08:00:00.00019,000

如果您有很多月、日、小时的数据,则此方法适用这内部循环:

SELECT 
    store
    ,month(date)
    ,day(date)
    ,hour(date)
    ,array_agg(object_construct('date', date::text, 'sales', sales)) within group (order by date) as hour_history
FROM data_table
GROUP BY 1,2,3,4

又名:

WITH data_table AS (
    SELECT * FROM VALUES
        (1,'2019-04-05T08:00:00Z'::timestamp,10000),
        (1,'2020-04-05T08:00:00Z'::timestamp,12000),
        (1,'2021-04-05T08:00:00Z'::timestamp,15000),
        (1,'2022-04-05T08:00:00Z'::timestamp,20000),

        (1,'2019-03-05T08:00:00Z'::timestamp,10001),
        (1,'2020-03-05T08:00:00Z'::timestamp,12001),
        (1,'2021-03-05T08:00:00Z'::timestamp,15001),
        (1,'2022-03-05T08:00:00Z'::timestamp,20001),

        (1,'2019-04-04T08:00:00Z'::timestamp,10002),
        (1,'2020-04-04T08:00:00Z'::timestamp,12002),
        (1,'2021-04-04T08:00:00Z'::timestamp,15002),
        (1,'2022-04-04T08:00:00Z'::timestamp,20002),

    
        (2,'2019-04-05T08:00:00Z'::timestamp,13000),
        (2,'2020-04-05T08:00:00Z'::timestamp,16000),
        (2,'2021-04-05T08:00:00Z'::timestamp,19000),
        (2,'2022-04-05T08:00:00Z'::timestamp,22000)
    t(store, date, sales)
)
SELECT store
    ,hour_history[0].date::timestamp as Timestamp_1
    ,hour_history[0].sales::number as Sales_1
    ,hour_history[1].date::timestamp as Timestamp_2
    ,hour_history[1].sales::number as Sales_2
    ,hour_history[2].date::timestamp as Timestamp_3
    ,hour_history[2].sales::number as Sales_3
FROM (
SELECT 
    store
    ,month(date)
    ,day(date)
    ,hour(date)
    ,array_agg(object_construct('date', date::text, 'sales', sales)) within group (order by date) as hour_history
FROM data_table
GROUP BY 1,2,3,4
)
ORDER BY 1;

给出:

STORETIMESTAMP_1SALES_1TIMESTAMP_2SALES_2TIMESTAMP_3SALES_3
12019-04-05 08:00:00.00010,0002020-04-05 08:00:00.00012,0002021-04-05 08:00:00.00015,000
12019-03-05 08:00:00.00010,0012020-03-05 08:00:00.00012,0012021-03-05 08:00:00.00015,001
12019-04-04 08:00:00.00010,0022020-04-04 08:00:00.00012,0022021-04-04 08:00:00.00015,002
22019-04-05 08:00:00.00013,0002020-04-05 08:00:00.00016,0002021-04-05 08:00:00.00019,000

您会注意到,您的示例有 4 年的数据,而您正在丢弃 2022 年的数据。

Not really your answer but if you only have one days worth of hours

SELECT 
    store
    ,hour(date)
    ,array_agg(object_construct(date::text, sales)) within group (order by date) as hour_history
FROM data_table
GROUP BY 1,2
ORDER BY 1,2;

gives:

STOREHOUR(DATE)HOUR_HISTORY
18[ { "2019-04-05 08:00:00.000": 10000 }, { "2020-04-05 08:00:00.000": 12000 }, { "2021-04-05 08:00:00.000": 15000 }, { "2022-04-05 08:00:00.000": 20000 } ]
28[ { "2019-04-05 08:00:00.000": 13000 }, { "2020-04-05 08:00:00.000": 16000 }, { "2021-04-05 08:00:00.000": 19000 }, { "2022-04-05 08:00:00.000": 22000 } ]

Thus:

SELECT store
    ,hour_history[0].date::timestamp as Timestamp_1
    ,hour_history[0].sales::number as Sales_1
    ,hour_history[1].date::timestamp as Timestamp_2
    ,hour_history[1].sales::number as Sales_2
    ,hour_history[2].date::timestamp as Timestamp_3
    ,hour_history[2].sales::number as Sales_3
FROM (
SELECT 
    store
    ,hour(date)
    ,array_agg(object_construct('date', date::text, 'sales', sales)) within group (order by date) as hour_history
FROM data_table
GROUP BY 1,2
)
ORDER BY 1;

does give:

STORETIMESTAMP_1SALES_1TIMESTAMP_2SALES_2TIMESTAMP_3SALES_3
12019-04-05 08:00:00.00010,0002020-04-05 08:00:00.00012,0002021-04-05 08:00:00.00015,000
22019-04-05 08:00:00.00013,0002020-04-05 08:00:00.00016,0002021-04-05 08:00:00.00019,000

If you have many months, day, hours worth of data this works for the inner loop:

SELECT 
    store
    ,month(date)
    ,day(date)
    ,hour(date)
    ,array_agg(object_construct('date', date::text, 'sales', sales)) within group (order by date) as hour_history
FROM data_table
GROUP BY 1,2,3,4

AKA:

WITH data_table AS (
    SELECT * FROM VALUES
        (1,'2019-04-05T08:00:00Z'::timestamp,10000),
        (1,'2020-04-05T08:00:00Z'::timestamp,12000),
        (1,'2021-04-05T08:00:00Z'::timestamp,15000),
        (1,'2022-04-05T08:00:00Z'::timestamp,20000),

        (1,'2019-03-05T08:00:00Z'::timestamp,10001),
        (1,'2020-03-05T08:00:00Z'::timestamp,12001),
        (1,'2021-03-05T08:00:00Z'::timestamp,15001),
        (1,'2022-03-05T08:00:00Z'::timestamp,20001),

        (1,'2019-04-04T08:00:00Z'::timestamp,10002),
        (1,'2020-04-04T08:00:00Z'::timestamp,12002),
        (1,'2021-04-04T08:00:00Z'::timestamp,15002),
        (1,'2022-04-04T08:00:00Z'::timestamp,20002),

    
        (2,'2019-04-05T08:00:00Z'::timestamp,13000),
        (2,'2020-04-05T08:00:00Z'::timestamp,16000),
        (2,'2021-04-05T08:00:00Z'::timestamp,19000),
        (2,'2022-04-05T08:00:00Z'::timestamp,22000)
    t(store, date, sales)
)
SELECT store
    ,hour_history[0].date::timestamp as Timestamp_1
    ,hour_history[0].sales::number as Sales_1
    ,hour_history[1].date::timestamp as Timestamp_2
    ,hour_history[1].sales::number as Sales_2
    ,hour_history[2].date::timestamp as Timestamp_3
    ,hour_history[2].sales::number as Sales_3
FROM (
SELECT 
    store
    ,month(date)
    ,day(date)
    ,hour(date)
    ,array_agg(object_construct('date', date::text, 'sales', sales)) within group (order by date) as hour_history
FROM data_table
GROUP BY 1,2,3,4
)
ORDER BY 1;

gives:

STORETIMESTAMP_1SALES_1TIMESTAMP_2SALES_2TIMESTAMP_3SALES_3
12019-04-05 08:00:00.00010,0002020-04-05 08:00:00.00012,0002021-04-05 08:00:00.00015,000
12019-03-05 08:00:00.00010,0012020-03-05 08:00:00.00012,0012021-03-05 08:00:00.00015,001
12019-04-04 08:00:00.00010,0022020-04-04 08:00:00.00012,0022021-04-04 08:00:00.00015,002
22019-04-05 08:00:00.00013,0002020-04-05 08:00:00.00016,0002021-04-05 08:00:00.00019,000

What you will note is you example has 4 years of data, and you are throwing away to 2022 data.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文