How to create period date ranges from a MySQL table by grouping each run of common values in a column

Posted 2024-09-07 15:29:16 · 1374 characters · 6 views · 0 comments

My goal is to return a start and end date for each run of consecutive rows that share the same values in a column. Here is my table. The rows marked with (*) show where I want the "EndDate" to fall for each run of identical A and B values:

ID | DayDate   |  A  |  B
-----------------------------------------------
1  | 2010/07/1 | 200 |  300
2  | 2010/07/2 | 200 |  300 *
3  | 2010/07/3 | 150 |  250
4  | 2010/07/4 | 150 |  250 *
8  | 2010/07/5 | 150 |  350 *
9  | 2010/07/6 | 200 |  300
10 | 2010/07/7 | 200 |  300 *
11 | 2010/07/8 | 100 |  200
12 | 2010/07/9 | 100 |  200 *

I want to get the following result table from the table above:

| DayDate   |EndDate   |  A  |  B
-----------------------------------------------
| 2010/07/1 |2010/07/2 | 200 |  300
| 2010/07/3 |2010/07/4 | 150 |  250
| 2010/07/5 |2010/07/5 | 150 |  350
| 2010/07/6 |2010/07/7 | 200 |  300
| 2010/07/8 |2010/07/9 | 100 |  200

UPDATE:

Thanks Mike. Your approach seems to work if the following row is treated as a mistake:

8  | 2010/07/5 | 150 |  350 * 

However, it is not a mistake. The challenge with this type of data is similar to logging market price changes by date. The real problem in my case is to select each run of rows, with its beginning and ending date, where both A and B match across all rows of the run, then to select the run that follows the previously selected one, and so on, so that no data in the table is left out.

I can explain with a real-world scenario. A hotel with rooms A and B has the room rates for each day entered into a table, as shown in my question. The hotel now needs a report that shows the price calendar in a shorter form using start and end dates, instead of listing every date entered. For example, from 2010/07/01 to 2010/07/02 the price of A is 200 and B is 300. The price changes for the 3rd to the 4th, and on the 5th there is a different price for that day only, where the price of Room B changes to 350. This counts as a single-day period, which is why its start and end dates are the same.

I hope this explains the scenario. Also note that the hotel may be closed for specific periods; consider this an additional problem on top of my first question. What if the rate is not entered for specific dates? For example, on Sundays the hotel does not sell these two rooms, so no price is entered, meaning the row will not exist in the table.

Comments (3)

就此别过 2024-09-14 15:29:16

Creating related tables gives you much greater freedom to query and extract relevant information. Here are a few links that you might find useful:

You could start with these tutorials:
http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html
http://net.tutsplus.com/tutorials/databases/sql-for-beginners/

There are also a couple of questions here on Stack Overflow that might be useful:
Normalization in plain English
What exactly does database normalization do?

Anyway, on to a possible solution. The following examples use your hotel rooms analogy.

First, create a table to hold information about the hotel rooms. This table just contains the room ID and its name, but you could store other information in here, such as the room type (single, double, twin), its view (ocean front, ocean view, city view, pool view), and so on:

CREATE TABLE `room` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(45) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE INDEX `name_UNIQUE` (`name` ASC) )
ENGINE = InnoDB;

Now create a table to hold the changing room rates. This table links to the room table through the room_id column. The foreign key constraint prevents records being inserted into the rate table which refer to rooms that do not exist:

CREATE TABLE `rate` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
  `room_id` INT UNSIGNED NOT NULL,
  `date` DATE NOT NULL,
  `rate` DECIMAL(6,2) UNSIGNED NOT NULL,
  PRIMARY KEY (`id`),
  INDEX `fk_room_rate` (`room_id` ASC),
  CONSTRAINT `fk_room_rate`
    FOREIGN KEY (`room_id` )
    REFERENCES `room` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;

Create two rooms, and add some daily rate information about each room:

INSERT INTO `room` (`id`, `name`) VALUES (1, 'A'), (2, 'B');

INSERT INTO `rate` (`id`, `room_id`, `date`, `rate`) VALUES
( 1, 1, '2010-07-01', 200),
( 2, 1, '2010-07-02', 200),
( 3, 1, '2010-07-03', 150),
( 4, 1, '2010-07-04', 150),
( 5, 1, '2010-07-05', 150),
( 6, 1, '2010-07-06', 200),
( 7, 1, '2010-07-07', 200),
( 8, 1, '2010-07-08', 100),
( 9, 1, '2010-07-09', 100),
(10, 2, '2010-07-01', 300),
(11, 2, '2010-07-02', 300),
(12, 2, '2010-07-03', 250),
(13, 2, '2010-07-04', 250),
(14, 2, '2010-07-05', 350),
(15, 2, '2010-07-06', 300),
(16, 2, '2010-07-07', 300),
(17, 2, '2010-07-08', 200),
(18, 2, '2010-07-09', 200);

With that information stored, a simple SELECT query with a JOIN will show you all the daily room rates:

SELECT
    room.name,
    rate.date,
    rate.rate
FROM room
JOIN rate
ON rate.room_id = room.id;

+------+------------+--------+
| name | date       | rate   |
+------+------------+--------+
| A    | 2010-07-01 | 200.00 |
| A    | 2010-07-02 | 200.00 |
| A    | 2010-07-03 | 150.00 |
| A    | 2010-07-04 | 150.00 |
| A    | 2010-07-05 | 150.00 |
| A    | 2010-07-06 | 200.00 |
| A    | 2010-07-07 | 200.00 |
| A    | 2010-07-08 | 100.00 |
| A    | 2010-07-09 | 100.00 |
| B    | 2010-07-01 | 300.00 |
| B    | 2010-07-02 | 300.00 |
| B    | 2010-07-03 | 250.00 |
| B    | 2010-07-04 | 250.00 |
| B    | 2010-07-05 | 350.00 |
| B    | 2010-07-06 | 300.00 |
| B    | 2010-07-07 | 300.00 |
| B    | 2010-07-08 | 200.00 |
| B    | 2010-07-09 | 200.00 |
+------+------------+--------+

To find the start and end dates for each room rate, you need a more complex query:

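SELECT
    MIN(id) AS id,           -- id of the first row in each run
    room_id,
    MIN(date) AS start_date,
    MAX(date) AS end_date,
    COUNT(*) AS days,
    rate
FROM (
    SELECT
        id,
        room_id,
        date,
        rate,
        (
            -- Rows for this room, on or before this date, with a different
            -- rate. The count stays constant within a run of equal rates.
            SELECT COUNT(*)
            FROM rate AS b
            WHERE b.rate <> a.rate
            AND b.date <= a.date
            AND b.room_id = a.room_id
        ) AS grp             -- GROUPING is a reserved word in MySQL 8.0
    FROM rate AS a
) c
-- room_id is part of the key so runs from different rooms never merge
GROUP BY room_id, rate, grp
ORDER BY room_id, MIN(date);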

+----+---------+------------+------------+------+--------+
| id | room_id | start_date | end_date   | days | rate   |
+----+---------+------------+------------+------+--------+
|  1 |       1 | 2010-07-01 | 2010-07-02 |    2 | 200.00 |
|  3 |       1 | 2010-07-03 | 2010-07-05 |    3 | 150.00 |
|  6 |       1 | 2010-07-06 | 2010-07-07 |    2 | 200.00 |
|  8 |       1 | 2010-07-08 | 2010-07-09 |    2 | 100.00 |
| 10 |       2 | 2010-07-01 | 2010-07-02 |    2 | 300.00 |
| 12 |       2 | 2010-07-03 | 2010-07-04 |    2 | 250.00 |
| 14 |       2 | 2010-07-05 | 2010-07-05 |    1 | 350.00 |
| 15 |       2 | 2010-07-06 | 2010-07-07 |    2 | 300.00 |
| 17 |       2 | 2010-07-08 | 2010-07-09 |    2 | 200.00 |
+----+---------+------------+------------+------+--------+

You can find a good explanation of the technique used in the above query here:
http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data
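
The same run-counting idea can also be applied directly to the two-column table from the question, by treating the (A, B) pair as the value that must stay constant. What follows is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database instead of MySQL so that it is self-contained, the table name `Dates` and the aliases `cur`, `other` and `grp` are made up for illustration, and the dates are stored as ISO strings.

```python
import sqlite3

# Assumption: an in-memory SQLite table standing in for the question's MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Dates (ID INTEGER PRIMARY KEY, DayDate TEXT, A INTEGER, B INTEGER)"
)
conn.executemany(
    "INSERT INTO Dates VALUES (?, ?, ?, ?)",
    [
        (1, "2010-07-01", 200, 300),
        (2, "2010-07-02", 200, 300),
        (3, "2010-07-03", 150, 250),
        (4, "2010-07-04", 150, 250),
        (8, "2010-07-05", 150, 350),
        (9, "2010-07-06", 200, 300),
        (10, "2010-07-07", 200, 300),
        (11, "2010-07-08", 100, 200),
        (12, "2010-07-09", 100, 200),
    ],
)

query = """
SELECT MIN(DayDate) AS StartDate, MAX(DayDate) AS EndDate, A, B
FROM (
    SELECT cur.DayDate AS DayDate, cur.A AS A, cur.B AS B,
           -- rows on or before this date whose (A, B) pair differs;
           -- the count is constant within one run of identical pairs
           (SELECT COUNT(*)
            FROM Dates AS other
            WHERE (other.A <> cur.A OR other.B <> cur.B)
              AND other.DayDate <= cur.DayDate) AS grp
    FROM Dates AS cur
) AS c
GROUP BY A, B, grp
ORDER BY StartDate
"""

periods = conn.execute(query).fetchall()
for period in periods:
    print(period)
```

On the sample data this yields the five periods listed in the question, including the single-day period on 2010-07-05. The correlated subquery scans the table once per row, so on a large table a window-function variant may be worth considering.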

恏ㄋ傷疤忘ㄋ疼 2024-09-14 15:29:16
  • My general approach is to join the table to itself on DayDate = DayDate + 1 where the A or B values are not equal
  • This finds the end date of each period (the last date before the value changes on the following day)
  • The only problem is that this won't find an end date for the final period. To get around this, I select the max date from the table and union it into my list of end dates
  • Once you have the list of end dates, you can join it to the original table on the end date being greater than or equal to the original date
  • From this final list, select the minimum DayDate grouped by the other fields

    SELECT
        MIN(DayDate) AS DayDate, EndDate, A, B
    FROM (
        SELECT DayDate, A, B, MIN(ends.EndDate) AS EndDate
        FROM yourtable
        LEFT JOIN (
            -- the last date in the table closes the final period
            SELECT MAX(DayDate) AS EndDate FROM yourtable
            UNION
            -- dates whose following day has a different A or B
            SELECT t1.DayDate AS EndDate
            FROM yourtable t1
            JOIN yourtable t2
              ON DATE_ADD(t1.DayDate, INTERVAL 1 DAY) = t2.DayDate
             AND (t1.A <> t2.A OR t1.B <> t2.B)
        ) ends
          ON ends.EndDate >= DayDate
        GROUP BY DayDate, A, B
    ) x
    GROUP BY EndDate, A, B
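
The join on DayDate + 1 relies on every calendar day having a row, which is not the case once the hotel skips days as described in the question. The question does not say how such gaps should be reported, so the following is only a minimal Python sketch under one assumption: a missing day ends the current period, even when the prices are unchanged after the gap. The function name `collapse_periods` and the sample rows with a missing 2010-07-03 are invented for illustration.

```python
from datetime import date, timedelta


def collapse_periods(rows):
    """Collapse (day, a, b) rows, sorted by day, into (start, end, a, b) periods.

    Assumption: a period continues only while the prices stay the same AND
    the dates are consecutive; a missing day starts a new period.
    """
    periods = []
    for day, a, b in rows:
        if periods:
            start, end, prev_a, prev_b = periods[-1]
            same_prices = (prev_a, prev_b) == (a, b)
            consecutive = day - end == timedelta(days=1)
            if same_prices and consecutive:
                # extend the current period by one day
                periods[-1] = (start, day, a, b)
                continue
        # first row, price change, or gap: open a new single-day period
        periods.append((day, day, a, b))
    return periods


rows = [
    (date(2010, 7, 1), 200, 300),
    (date(2010, 7, 2), 200, 300),
    # 2010-07-03 has no row (hotel closed that day)
    (date(2010, 7, 4), 200, 300),
    (date(2010, 7, 5), 150, 350),
]

periods = collapse_periods(rows)
for start, end, a, b in periods:
    print(start, end, a, b)
```

If a gap should instead be bridged when the prices on both sides match, dropping the `consecutive` check gives that behavior.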
    
执笔绘流年 2024-09-14 15:29:16

I think I have found a solution which does produce the table desired.

SELECT
    a.DayDate AS StartDate,
    (
        SELECT b.DayDate
        FROM Dates AS b
        WHERE b.DayDate > a.DayDate AND (b.B = a.B OR b.B IS NULL)
        ORDER BY b.DayDate ASC LIMIT 1
    ) AS StopDate,
    a.A AS A,
    a.B AS B
FROM Dates AS a
WHERE COALESCE(
          (
              SELECT c.B
              FROM Dates AS c
              WHERE c.DayDate <= a.DayDate
              ORDER BY c.DayDate DESC LIMIT 1,1
          ), -99999
      ) <> a.B
  AND a.B IS NOT NULL
ORDER BY a.DayDate ASC;

It generates the following result:

StartDate   StopDate    A   B
2010-07-01  2010-07-02  200 300
2010-07-03  2010-07-04  150 250
2010-07-05  NULL        150 350
2010-07-06  2010-07-07  200 300
2010-07-08  2010-07-09  100 200

But I still need a way to replace the NULL with the start date.
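
One possible way to fill in that NULL is to wrap the StopDate subquery in COALESCE so that it falls back to the row's own date. This is a minimal sketch rather than a tested MySQL solution: it runs the query through Python's built-in sqlite3 module against an in-memory SQLite copy of the sample data (an assumption made to keep it self-contained), and the aliases `cur`, `nxt` and `prv` are renamed from the query above purely for readability.

```python
import sqlite3

# Assumption: in-memory SQLite stand-in for the MySQL table named Dates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Dates (DayDate TEXT PRIMARY KEY, A INTEGER, B INTEGER)")
conn.executemany(
    "INSERT INTO Dates VALUES (?, ?, ?)",
    [
        ("2010-07-01", 200, 300),
        ("2010-07-02", 200, 300),
        ("2010-07-03", 150, 250),
        ("2010-07-04", 150, 250),
        ("2010-07-05", 150, 350),
        ("2010-07-06", 200, 300),
        ("2010-07-07", 200, 300),
        ("2010-07-08", 100, 200),
        ("2010-07-09", 100, 200),
    ],
)

query = """
SELECT
    cur.DayDate AS StartDate,
    COALESCE(
        (SELECT nxt.DayDate
         FROM Dates AS nxt
         WHERE nxt.DayDate > cur.DayDate AND (nxt.B = cur.B OR nxt.B IS NULL)
         ORDER BY nxt.DayDate ASC LIMIT 1),
        cur.DayDate              -- fallback: no later match, so stop = start
    ) AS StopDate,
    cur.A AS A,
    cur.B AS B
FROM Dates AS cur
WHERE COALESCE(
          (SELECT prv.B          -- B of the row just before the current one
           FROM Dates AS prv
           WHERE prv.DayDate <= cur.DayDate
           ORDER BY prv.DayDate DESC LIMIT 1,1),
          -99999
      ) <> cur.B
  AND cur.B IS NOT NULL
ORDER BY cur.DayDate ASC
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

This only addresses the NULL. StopDate is still the next date with the same B value rather than the true end of the run, so a run longer than two days, or a B value that reappears later, would not get the correct end date.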
