How to fill missing dates in a range in MySQL?

Posted 2024-09-15 01:40:28

I have a table with two columns, date and score. It has at most 30 entries, one for each of the last 30 days.

date      score
-----------------
1.8.2010  19
2.8.2010  21
4.8.2010  14
7.8.2010  10
10.8.2010 14

My problem is that some dates are missing - I want to see:

date      score
-----------------
1.8.2010  19
2.8.2010  21
3.8.2010  0
4.8.2010  14
5.8.2010  0
6.8.2010  0
7.8.2010  10
...

What I need from the single query is to get: 19,21,0,14,0,0,10,0,0,14... That means that the missing dates are filled with 0.

I know how to fetch all the values and iterate through the dates in a server-side language, filling in the blanks. But is it possible to do this in MySQL, so that I sort the result by date and get the missing pieces filled in?

EDIT: In this table there is another column named UserID; I have 30,000 users and some of them have scores in this table. Every day I delete the rows whose date is more than 30 days old, because I only need the last 30 days of scores for each user. The reason is that I am making a graph of user activity over the last 30 days, and to plot the chart I need the 30 values separated by commas. So I want to ask in a query for the activity of USERID=10203 and get back 30 scores, one for each of the last 30 days. I hope this is clearer now.
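
For comparison, the server-side approach mentioned above (fetch the existing rows, then walk the date range and fill the gaps) could look roughly like this. It is only a minimal Python sketch: the function name `fill_missing_scores` and the in-memory `rows` dict are assumptions for illustration, standing in for whatever the real query returns.

```python
from datetime import date, timedelta

def fill_missing_scores(rows, start, end):
    """Return one score per day from start to end (inclusive), using 0 for missing days."""
    days = (end - start).days + 1
    return [rows.get(start + timedelta(days=i), 0) for i in range(days)]

# Sample data from the question (written there as D.M.YYYY)
rows = {
    date(2010, 8, 1): 19,
    date(2010, 8, 2): 21,
    date(2010, 8, 4): 14,
    date(2010, 8, 7): 10,
    date(2010, 8, 10): 14,
}

scores = fill_missing_scores(rows, date(2010, 8, 1), date(2010, 8, 10))
csv_line = ",".join(str(s) for s in scores)
print(csv_line)  # 19,21,0,14,0,0,10,0,0,14
```

The answers below do the same gap filling inside MySQL by generating the date range in SQL and LEFT JOINing the score table onto it.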

Comments (7)

残花月 2024-09-22 01:40:28

MySQL versions before 8.0 don't have recursive functionality, so you're left with using the NUMBERS table trick:

  1. Create a table that only holds incrementing numbers - easy to do using an auto_increment:

    DROP TABLE IF EXISTS `example`.`numbers`;
    CREATE TABLE  `example`.`numbers` (
      `id` int(10) unsigned NOT NULL auto_increment,
       PRIMARY KEY  (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
  2. Populate the table using:

    INSERT INTO `example`.`numbers`
      ( `id` )
    VALUES
      ( NULL )
    

    ...for as many values as you need.

  3. Use DATE_ADD to construct a list of dates, increasing the days based on the NUMBERS.id value. Replace "2010-06-06" and "2010-06-14" with your respective start and end dates (but use the same format, YYYY-MM-DD) -

    SELECT `x`.*
      FROM (SELECT DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY)
              FROM `numbers` `n`
             WHERE DATE_ADD('2010-06-06', INTERVAL `n`.`id` -1 DAY) <= '2010-06-14' ) x
    
  4. LEFT JOIN onto your table of data based on the date. Keep `ts` as a plain date inside the subquery so that it compares correctly with the parsed `date` column, and replace `your_table` with the name of your data table:

       SELECT `x`.`ts` AS `timestamp`,
              COALESCE(`y`.`score`, 0) AS `cnt`
         FROM (SELECT DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) AS `ts`
                 FROM `numbers` `n`
                WHERE DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) <= '2010-06-14') x
    LEFT JOIN `your_table` `y` ON STR_TO_DATE(`y`.`date`, '%d.%m.%Y') = `x`.`ts`
     ORDER BY `x`.`ts`
    

If you want to maintain the date format, use the DATE_FORMAT function:

DATE_FORMAT(`x`.`ts`, '%d.%m.%Y') AS `timestamp`
悟红尘 2024-09-22 01:40:28

Time has passed since this question was asked. MySQL 8.0 was released in 2018 and added support for recursive common table expressions, which provide an elegant, up-to-date solution to this question.

The following query can be used to generate a list of dates, say for the first 15 days of August 2010:

with recursive all_dates(dt) as (
    -- anchor
    select '2010-08-01' dt
    union all 
    -- recursion with stop condition
    select dt + interval 1 day from all_dates where dt < '2010-08-15'
)
select * from all_dates order by dt

You can then left join this resultset with your table to generate the expected output:

with recursive all_dates(dt) as (
    select '2010-08-01' dt
    union all 
    select dt + interval 1 day from all_dates where dt < '2010-08-15'
)
select d.dt date, coalesce(t.score, 0) score
from all_dates d
left join mytable t on t.date = d.dt
order by d.dt

Demo on DB Fiddle (https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=2e1afb7a2286a9c0c219e3f31de2ef49):

date       | score
:--------- | ----:
2010-08-01 |    19
2010-08-02 |    21
2010-08-03 |     0
2010-08-04 |    14
2010-08-05 |     0
2010-08-06 |     0
2010-08-07 |    10
2010-08-08 |     0
2010-08-09 |     0
2010-08-10 |    14
2010-08-11 |     0
2010-08-12 |     0
2010-08-13 |     0
2010-08-14 |     0
2010-08-15 |     0

Note that it is very easy to adapt the recursive CTE to other intervals or periods. As an example, say we want a row every 15 minutes from 4 AM to 8 AM on August 1st, 2010; we can do:

with recursive all_dates(dt) as (
    select '2010-08-01 04:00:00' dt
    union all 
    select dt + interval 15 minute from all_dates where dt < '2010-08-01 08:00:00'
)
...
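
To see which rows a stop condition of the form `dt < stop` produces, the recursion can be mimicked outside the database. The following is a small Python sketch, not MySQL code; the helper name `time_series` is an assumption for illustration. It shows that the stop value itself is included, because the condition is checked against the previous row before the interval is added.

```python
from datetime import datetime, timedelta

def time_series(start, stop, step):
    """Mimic the recursive CTE: emit start, then keep adding step while the last value is < stop."""
    values = [start]
    while values[-1] < stop:
        values.append(values[-1] + step)
    return values

slots = time_series(
    datetime(2010, 8, 1, 4, 0),
    datetime(2010, 8, 1, 8, 0),
    timedelta(minutes=15),
)
print(len(slots), slots[0], slots[-1])
```

For the 15-minute example this gives 17 rows, from 04:00:00 through 08:00:00 inclusive.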
怪我闹别瞎闹 2024-09-22 01:40:28

I'm not a fan of the other answers, requiring tables to be created and such. This query does it efficiently without helper tables.

SELECT 
    IF(score IS NULL, 0, score) AS score,
    b.Days AS date
FROM 
    (SELECT a.Days 
    FROM (
        SELECT curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY AS Days
        FROM       (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
        CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
        CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
    ) a
    WHERE a.Days >= curdate() - INTERVAL 30 DAY) b
LEFT JOIN your_table
    ON date = b.Days
ORDER BY b.Days;

So let's dissect this.

SELECT 
    IF(score IS NULL, 0, score) AS score,
    b.Days AS date

The IF detects days that have no score and sets them to 0 (COALESCE(score, 0) would do the same). b.Days holds the generated dates, counting back from the current date by the number of days you configure, up to 1000.

    (SELECT a.Days 
    FROM (
        SELECT curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY AS Days
        FROM       (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
        CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
        CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
    ) a
    WHERE a.Days >= curdate() - INTERVAL 30 DAY) b

This subquery is something I saw on Stack Overflow. It generates a list of the past 1000 days, counting back from the current date, without any helper table. The interval (currently 30) in the WHERE clause at the end determines which days are returned; the maximum is 1000. The query could easily be modified to return hundreds of years' worth of dates, but 1000 days should be enough for most purposes.
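
The arithmetic behind the three-way CROSS JOIN can be checked outside the database. This is a short Python sketch (the variable names are mine, not part of the answer) showing that `a.a + (10 * b.a) + (100 * c.a)` over three digit tables yields every day offset from 0 to 999 exactly once:

```python
from itertools import product

digits = range(10)

# Same arithmetic as a.a + (10 * b.a) + (100 * c.a) over the three-way CROSS JOIN
offsets = sorted(a + 10 * b + 100 * c for a, b, c in product(digits, repeat=3))
print(len(offsets), offsets[0], offsets[-1])  # 1000 0 999

# The WHERE clause keeps the days with an offset of at most 30
kept = [o for o in offsets if o <= 30]
print(len(kept))  # 31
```

Note that the filter keeps offsets 0 through 30, so the query returns 31 rows including today; use 29 in the WHERE clause if exactly 30 days are wanted.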

LEFT JOIN your_table
    ON date = b.Days
ORDER BY b.Days;

This is the part that brings in your table containing the scores. Joining it against the date range from the date generator query is what lets you fill in 0s where needed (the score is NULL initially for missing days, because this is a LEFT JOIN; that is fixed in the SELECT list). I also order by date, but that is only a preference; you could order by score instead.

Before the ORDER BY you could easily join with the user info table mentioned in your edit, to add that last requirement.

I hope this version of the query helps someone. Thanks for reading.

独﹏钓一江月 2024-09-22 01:40:28

You can accomplish this by using a calendar table. That's a table which you create once and fill with a date range (e.g. one row for each day from 2000 to 2050; the range depends on your data). Then you can make an outer join from the calendar table to your table. If a date is missing in your table, you return 0 for the score.
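
The calendar table idea can be tried end to end with a small self-contained sketch. Note the assumptions: SQLite (through Python's `sqlite3` module) stands in for MySQL so that nothing needs to be installed, the table and column names (`calendar`, `scores`, `score_date`) are made up for illustration, and the calendar covers only five days. In MySQL the date arithmetic would use `DATE_ADD` instead of SQLite's `date()` modifiers, but the LEFT JOIN with `COALESCE` has the same shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (calendar_date TEXT PRIMARY KEY);
    CREATE TABLE scores (score_date TEXT PRIMARY KEY, score INTEGER);
    INSERT INTO scores VALUES ('2010-08-01', 19), ('2010-08-02', 21), ('2010-08-04', 14);
""")

# Fill the calendar once; only 2010-08-01 .. 2010-08-05 here to keep the demo small
conn.executemany(
    "INSERT INTO calendar VALUES (date('2010-08-01', ?))",
    [(f"+{i} day",) for i in range(5)],
)

# Outer join from the calendar to the data; missing days get a score of 0
rows = conn.execute("""
    SELECT c.calendar_date, COALESCE(s.score, 0)
      FROM calendar c
      LEFT JOIN scores s ON s.score_date = c.calendar_date
     ORDER BY c.calendar_date
""").fetchall()
print(rows)
```

The calendar must be the left-hand side of the join; joining the other way round would only return the dates that already have a score.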

半岛未凉 2024-09-22 01:40:28

Michael Conard's answer is great, but I needed intervals of 15 minutes where each time always falls exactly on a quarter hour:

SELECT a.Days 
FROM (
    SELECT FROM_UNIXTIME( FLOOR( UNIX_TIMESTAMP() / (15 * 60) ) * (15 * 60)) - INTERVAL 15 * (a.a + (10 * b.a) + (100 * c.a)) MINUTE AS Days
    FROM       (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
    CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
    CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
) a
WHERE a.Days >= curdate() - INTERVAL 30 DAY

This rounds the current time down to the previous quarter hour:

FROM_UNIXTIME( FLOOR( UNIX_TIMESTAMP() / (15 * 60) ) * (15 * 60))

And this subtracts time in 15-minute steps:

- INTERVAL 15 * (a.a + (10 * b.a) + (100 * c.a)) MINUTE

If there's a simpler way to do it, please let me know.
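
The rounding step is plain integer arithmetic on the Unix timestamp, which can be verified outside MySQL. This is a small Python sketch of the same calculation; the function name `floor_to_quarter_hour` and the sample timestamp are assumptions for illustration.

```python
from datetime import datetime, timezone

STEP = 15 * 60  # seconds in 15 minutes

def floor_to_quarter_hour(ts):
    """Same arithmetic as FLOOR(UNIX_TIMESTAMP() / (15 * 60)) * (15 * 60)."""
    return (ts // STEP) * STEP

ts = int(datetime(2010, 8, 1, 4, 37, 12, tzinfo=timezone.utc).timestamp())
floored = datetime.fromtimestamp(floor_to_quarter_hour(ts), tz=timezone.utc)
print(floored.isoformat())  # 2010-08-01T04:30:00+00:00
```

Dividing by 900 seconds, dropping the remainder, and multiplying back discards everything after the last quarter-hour boundary, so 04:37:12 becomes 04:30:00.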

苏璃陌 2024-09-22 01:40:28

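You can also insert the missing dates directly, from a start date up to today, with a score of 0:

    -- INSERT IGNORE skips dates that already exist, provided `date` has a unique key
    -- ranges longer than 1000 days need a higher cte_max_recursion_depth (default 1000)
    insert ignore into mytable (date, score)
    with recursive all_dates(dt) as (
        -- anchor
        select '2021-01-01' dt
        union all
        -- recursion with stop condition
        select dt + interval 1 day from all_dates where dt + interval 1 day <= curdate()
    )
    select dt, 0 from all_dates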
月亮坠入山谷 2024-09-22 01:40:28

If you are joining against a calendar table, you can probably use this.
It may help you generate all the missing dates between the minimum and maximum dates of your table, along with the other columns of your table.

select c.calendar_date,fact.column3,fact.column4,fact.column5 
from calendar c 
join (SELECT min(your_date) as mindt, max(your_date) as maxdt,column3,column4,column5
from your_table         
group by 3,4,5
) fact
on c.calendar_date between fact.mindt and fact.maxdt