PostgreSQL GROUP BY including zeros

Posted 2025-01-19 11:31:52

I have a SQL query (postgresql) that looks something like this:

SELECT
    my_timestamp::timestamp::date as the_date,
    count(*) as count
FROM my_table
WHERE ...
GROUP BY the_date
ORDER BY the_date

The result is a table of YYYY-MM-DD, count pairs.

Now I've been asked to fill in the empty dates with zero. So if I was previously providing

2022-03-15    3
2022-03-17    1

I'd now want to return

2022-03-15    3
2022-03-16    0
2022-03-17    1

Now I can easily do this client-side (relative to the database) and let my program compute and return the zero-augmented list to its clients based on the original list from postgres. But perhaps it would be better if I could just tell postgresql to include zeros.

I suspect this isn't easy at all, because postgres has no obvious way of knowing what I'm up to. But in the interests of learning more about postgres and SQL, I thought I'd have a try. The try isn't too promising thus far...

Any pointers before I conclude that I was right to leave this to my (postgres client) program?

Update

This is an interesting case where my simplification of the problem led to a correct answer that didn't work for me. For those who come after, I thought it worth documenting what followed, because it takes some fun twists through constructing SQL queries.

@a_horse_with_no_name responded with a query that I've verified works if I simplify my own query to match. Unfortunately, my query had some extra baggage that I didn't think pertinent, and so had trimmed out when posting the original question.

Here's my real (original) query, with all names preserved (if shortened):

-- current query
SELECT
    LEAST(time1, time2, time3, time4)::timestamp::date as the_date,
    count(*) as count
FROM reading_group_reader rgr
INNER JOIN (
    SELECT group_id, group_type::group_type_name
    FROM (VALUES (31198, 'excerpt')) as T(group_id, group_type)
) TT
ON TT.group_id = rgr.group_id
  AND TT.group_type = rgr.group_type
WHERE LEAST(time1, time2, time3, time4) > current_date - 30
GROUP BY the_date
ORDER BY the_date;

If I translate that directly into the proposed solution, however, the inner join between reading_group_reader and the temporary table TT causes the left join to become inner (I think) and the date sequence drops its zeros again. Fwiw, the table TT is a table because sometimes it actually is a subselect.
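
A minimal sketch of that effect, using made-up inline data rather than the real tables: `r` stands in for reading_group_reader and `tt` for TT (the names, columns, and values here are all hypothetical). When the INNER JOIN is chained after the LEFT JOIN, its condition is tested against the NULL-extended rows too, and a NULL `r.group_id` never equals anything, so the unmatched dates disappear.

```sql
-- Hypothetical stand-ins: r for reading_group_reader, tt for TT.
WITH r(group_id, d) AS (
    VALUES (1, date '2022-03-15'),
           (2, date '2022-03-16'),   -- belongs to a group that tt does not list
           (1, date '2022-03-17')
),
tt(group_id) AS (
    VALUES (1)
)
SELECT g.dt::date        AS the_date,
       count(r.group_id) AS count
FROM generate_series(date '2022-03-15', date '2022-03-18', interval '1 day') AS g(dt)
LEFT JOIN r   ON r.d = g.dt::date
INNER JOIN tt ON tt.group_id = r.group_id  -- applied after the LEFT JOIN, so it filters its result
GROUP BY g.dt
ORDER BY g.dt;
--  the_date  | count
-- 2022-03-15 |     1
-- 2022-03-17 |     1
-- 2022-03-16 (filtered group) and 2022-03-18 (no rows at all) are both missing.
```

Keeping the inner join inside a subquery on the right-hand side of the LEFT JOIN avoids this, because the filtering is then finished before the outer join runs.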

So I transformed my query into this:

SELECT
    g.dt::date as the_date,
    count(*) as count
FROM generate_series(date '2022-03-06', date '2022-04-06', interval '1 day') as g(dt)
LEFT JOIN (
    SELECT
        LEAST(rgr.time1, rgr.time2, rgr.time3, rgr.time4)::timestamp::date as the_date
    FROM reading_group_reader rgr
    INNER JOIN (
        SELECT group_id, group_type::group_type_name
        FROM (VALUES (31198, 'excerpt')) as T(group_id, group_type)
    ) TT
    ON TT.group_id = rgr.group_id
      AND TT.group_type = rgr.group_type
) rgrt
ON rgrt.the_date = g.dt::date
GROUP BY g.dt
ORDER BY the_date;

But this outputs 1 instead of 0 for the dates that should be 0.

The reason is that count(*) counts rows, and the LEFT JOIN now produces one row for every generated date even when nothing matches it. I need to select an additional column from the joined side (which will be NULL for unmatched dates) and count that instead.
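
To make the difference concrete, here is a small self-contained sketch with invented inline data (the `readings` name and its values are assumptions, chosen to mirror the 3 / 0 / 1 example from the question). It puts count(*) and count(column) side by side over the same LEFT JOIN.

```sql
-- Hypothetical sample rows: three readings on the 15th, none on the 16th, one on the 17th.
WITH readings(device_id, d) AS (
    VALUES (101, date '2022-03-15'),
           (102, date '2022-03-15'),
           (103, date '2022-03-15'),
           (104, date '2022-03-17')
)
SELECT g.dt::date         AS the_date,
       count(*)           AS count_star,    -- counts rows, including the NULL-extended one
       count(r.device_id) AS count_column   -- ignores NULLs, so an unmatched date yields 0
FROM generate_series(date '2022-03-15', date '2022-03-17', interval '1 day') AS g(dt)
LEFT JOIN readings AS r ON r.d = g.dt::date
GROUP BY g.dt
ORDER BY g.dt;
--  the_date  | count_star | count_column
-- 2022-03-15 |          3 |            3
-- 2022-03-16 |          1 |            0
-- 2022-03-17 |          1 |            1
```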

So this query finally does what I want:

SELECT
    g.dt::date as the_date,
    count(rgrt.device_id) as count
FROM generate_series(date '2022-03-06', date '2022-04-06', interval '1 day') as g(dt)
LEFT JOIN (
    SELECT
        LEAST(rgr.time1, rgr.time2, rgr.time3, rgr.time4)::timestamp::date as the_date,
        rgr.device_id
    FROM reading_group_reader rgr
    INNER JOIN (
        SELECT group_id, group_type::group_type_name
        FROM (VALUES (31198, 'excerpt')) as T(group_id, group_type)
    ) TT
    ON TT.group_id = rgr.group_id
      AND TT.group_type = rgr.group_type
) rgrt(the_date)
ON rgrt.the_date = g.dt::date
GROUP BY g.dt
ORDER BY g.dt;

And, of course, on re-reading the accepted answer, I eventually saw that it does count a column from the joined table rather than using count(*), which I'd simply missed on my first several readings.
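
The final query above hard-codes its date range, whereas the original filtered on `current_date - 30`. If a rolling window is what is wanted, one possible variant is to build the series from current_date as well. This is only a sketch under that assumption: the `readings` CTE, its columns, and its two rows are invented so that the statement runs on its own.

```sql
-- Hypothetical stand-in for the joined subquery (rgrt) in the final query.
WITH readings(device_id, seen_at) AS (
    VALUES (101, current_date - 2 + time '09:00'),  -- two days ago, 09:00
           (102, current_date + time '00:30')       -- today, 00:30
)
SELECT g.dt::date         AS the_date,
       count(r.device_id) AS count
FROM generate_series(current_date - 30, current_date, interval '1 day') AS g(dt)
LEFT JOIN readings AS r
       ON r.seen_at::date = g.dt::date
GROUP BY g.dt
ORDER BY g.dt;
-- Returns one row per day from 30 days ago through today,
-- with 0 for every day that has no reading.
```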

Comments (2)

短叹 2025-01-26 11:31:52

You will need to join to a list of dates. This can be done, for example, using generate_series():

SELECT g.dt::date as the_date,
       count(t.my_timestamp) as count
FROM generate_series(date '2022-03-01', 
                     date '2022-03-31', 
                     interval '1 day') as g(dt)
  LEFT JOIN my_table as t 
        ON t.my_timestamp::date = g.dt::date
       AND ... -- the original WHERE clause goes here!
GROUP BY the_date
ORDER BY the_date;

Note that the original WHERE conditions need to go into the join condition of the LEFT JOIN. You can't put them into a WHERE clause because that would turn the outer join back into an inner join (which means the missing dates wouldn't be returned).
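
A small sketch of the point about WHERE versus the join condition, using invented inline data in place of my_table (the `status` column and its values are hypothetical and only stand in for "the original WHERE clause"). The first statement filters in WHERE and loses a date; the second moves the same test into the ON clause and keeps it with a count of 0.

```sql
-- Filter in WHERE: rows are discarded after the outer join, so the date goes with them.
WITH my_table(my_timestamp, status) AS (
    VALUES (timestamp '2022-03-15 09:00', 'ok'),
           (timestamp '2022-03-16 10:00', 'bad'),
           (timestamp '2022-03-17 11:00', 'ok')
)
SELECT g.dt::date AS the_date, count(t.my_timestamp) AS count
FROM generate_series(date '2022-03-15', date '2022-03-17', interval '1 day') AS g(dt)
LEFT JOIN my_table AS t
       ON t.my_timestamp::date = g.dt::date
WHERE t.status = 'ok'
GROUP BY g.dt
ORDER BY g.dt;
-- 2022-03-15 | 1
-- 2022-03-17 | 1

-- Same filter in ON: the non-matching row is simply not joined, and the date survives.
WITH my_table(my_timestamp, status) AS (
    VALUES (timestamp '2022-03-15 09:00', 'ok'),
           (timestamp '2022-03-16 10:00', 'bad'),
           (timestamp '2022-03-17 11:00', 'ok')
)
SELECT g.dt::date AS the_date, count(t.my_timestamp) AS count
FROM generate_series(date '2022-03-15', date '2022-03-17', interval '1 day') AS g(dt)
LEFT JOIN my_table AS t
       ON t.my_timestamp::date = g.dt::date
      AND t.status = 'ok'
GROUP BY g.dt
ORDER BY g.dt;
-- 2022-03-15 | 1
-- 2022-03-16 | 0
-- 2022-03-17 | 1
```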

雪若未夕 2025-01-26 11:31:52

Casting the integer field that carries the zeros to text solved a similar problem for me.
