从结果集中获取开始日期和结束日期的 SQL 查询

发布于 2025-01-18 10:43:01 字数 2651 浏览 2 评论 0原文

我正在处理一项要求,原始数据采用以下格式

要求 - 开始日期应该是状态更改为 1 时的日期,结束日期应该是记录状态从 1 更改为任何其他状态后的第一个日期数字。

客户状态日期
A12307/2/2021
A12307/15/2021
A12307/22/2021
A12318/18/2021
A12319/8/2021
A123012/1/2021
A12301/21 /2022
A12313/6/2022
A12313/7/2022
A12303/15/2022
B12311/1/2022
B12301/6/2022
C12311/2/2022
C12321/8/2022
C12302022 年 1 月 9 日

预期输出

客户StartDateEndDate
A1238/18/202112/1/2021
A1239/8/202112/1/2021
A1233/6/20223/15/2022
A1233/7/ 20222022 年 3 月 15 日
B1231/1/20221/6/2022
C1231/2/20221/8/2022

我试图获取的输出如下,我得到了客户 B123 和 C123 的输出,但没有得到预期的 A123 的输出。

查询说明 - 在查询的第一部分中,我将获取状态 = 1 的所有记录,在下一部分中仅获取状态不等于 1 的记录,并根据客户和客户连接这两个数据集生成的行号。

SELECT A.[Customer],A.StartDate,B.EndDate
    from
    (
         SELECT [Customer],MIN(Date) AS STARTDATE,[Status],RANK() OVER (PARTITION BY [STATUS] ORDER BY Date ASC) AS ROWNUM
         FROM table1
         WHERE [STATUS] = 1
         GROUP BY Customer,Date,[Status]
     ) A
     LEFT JOIN
     (
        SELECT [Customer],MIN(Date) AS ENDDATE,[Status],RANK() OVER (PARTITION BY [STATUS] ORDER BY Date ASC) AS ROWNUM
         FROM table1
         WHERE [STATUS] != 1 
         AND Date>(
                    SELECT MIN(Date) AS STARTDATE
                    FROM table1
                    WHERE [STATUS] = 1
                )
     GROUP BY Customer,Date,[Status]
     ) B 
        ON 
        (
            A.[Customer] = B.[Customer] 
            AND A.RowNum = B.RowNum
        )
     ORDER BY A.Startdate

I am working on one of requirement the raw data is in following format

Requirement - Startdate should be the date when status changed to 1 and enddate should be the 1st date after the record status changed from 1 to any other number.

CustomerStatusDate
A12307/2/2021
A12307/15/2021
A12307/22/2021
A12318/18/2021
A12319/8/2021
A123012/1/2021
A12301/21/2022
A12313/6/2022
A12313/7/2022
A12303/15/2022
B12311/1/2022
B12301/6/2022
C12311/2/2022
C12321/8/2022
C12301/9/2022

expected output

CustomerStartDateEndDate
A1238/18/202112/1/2021
A1239/8/202112/1/2021
A1233/6/20223/15/2022
A1233/7/20223/15/2022
B1231/1/20221/6/2022
C1231/2/20221/8/2022

Query I tried to get the output is below, I am getting the output for Customer B123 and C123, but not for A123 as expected.

Query Explanation - In 1st part of query I am taking all the records with status = 1 and in next part taking only those records where status is not equal to 1, and joining these 2 datasets based on Customer and row number generated.

SELECT A.[Customer],A.StartDate,B.EndDate
    from
    (
         SELECT [Customer],MIN(Date) AS STARTDATE,[Status],RANK() OVER (PARTITION BY [STATUS] ORDER BY Date ASC) AS ROWNUM
         FROM table1
         WHERE [STATUS] = 1
         GROUP BY Customer,Date,[Status]
     ) A
     LEFT JOIN
     (
        SELECT [Customer],MIN(Date) AS ENDDATE,[Status],RANK() OVER (PARTITION BY [STATUS] ORDER BY Date ASC) AS ROWNUM
         FROM table1
         WHERE [STATUS] != 1 
         AND Date>(
                    SELECT MIN(Date) AS STARTDATE
                    FROM table1
                    WHERE [STATUS] = 1
                )
     GROUP BY Customer,Date,[Status]
     ) B 
        ON 
        (
            A.[Customer] = B.[Customer] 
            AND A.RowNum = B.RowNum
        )
     ORDER BY A.Startdate

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

我不咬妳我踢妳 2025-01-25 10:43:01

首先,您列出status = 1的行,然后使用交叉应用获取状态不等于1的相应最小日期

select s.[Customer],
       StartDate = s.[Date],
       EndDate   = e.[Date]
from   Table1 s
       cross apply
       (
           select [Date] = min(e.[Date])
           from   Table1 e
           where  e.[Customer] = s.[Customer]
           and    e.[Date]     > s.[Date]
           and    e.[Status]   <> 1
       ) e
where  s.[Status] = 1 
order by s.[Customer], s.[Date]

First you list the rows where Status = 1 and then use CROSS APPLY to get the corresponding minimum Date where the Status is not equal to 1

select s.[Customer],
       StartDate = s.[Date],
       EndDate   = e.[Date]
from   Table1 s
       cross apply
       (
           select [Date] = min(e.[Date])
           from   Table1 e
           where  e.[Customer] = s.[Customer]
           and    e.[Date]     > s.[Date]
           and    e.[Status]   <> 1
       ) e
where  s.[Status] = 1 
order by s.[Customer], s.[Date]
旧夏天 2025-01-25 10:43:01

这是一种无需自连接即可更有效地执行此操作的方法。

WITH cte01only AS
  ( SELECT *, CASE Status WHEN 1 THEN 1 ELSE 0 END AS Status1 FROM table1 ),
  cteDifference AS
  (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY Customer  ORDER BY Date, Status1)
        - ROW_NUMBER() OVER (PARTITION BY Customer, Status1  ORDER BY Date) AS StatusGroup
    FROM cte01only
  ),
  cteGroup AS
  (
    SELECT   Customer, StatusGroup, Status1, MIN(Date) As StartDate
    FROM cteDifference
    GROUP BY Customer, StatusGroup, Status1
  ),
  cteNextDate AS
  (
    SELECT Customer, StatusGroup, Status1, StartDate,
        LEAD(StartDate, 1, NULL) OVER (PARTITION BY Customer  ORDER BY StatusGroup) AS EndDate
    FROM cteGroup
  )
SELECT Customer, StartDate, EndDate
FROM cteNextDate
WHERE Status1 = 1
ORDER BY Customer, StateDate

这里的关键技巧是第二个 CTE,它使用两个 ROW_NUMBER() 函数的差异,通过连续运行将客户记录(使用 StatusGroup 列)标记到单独的分区中状态为 1 或非 1 的记录。之后可以根据该标签对它们进行分组以获取开始日期,然后使用 LEAD() 函数获取后面组的StartDate 作为当前分组EndDate

(可能有更紧凑的方式来表达这一点,但我喜欢将每个阶段布局为单独的 CTE。)

Here is a more efficient way to do this without a self-join.

WITH cte01only AS
  ( SELECT *, CASE Status WHEN 1 THEN 1 ELSE 0 END AS Status1 FROM table1 ),
  cteDifference AS
  (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY Customer  ORDER BY Date, Status1)
        - ROW_NUMBER() OVER (PARTITION BY Customer, Status1  ORDER BY Date) AS StatusGroup
    FROM cte01only
  ),
  cteGroup AS
  (
    SELECT   Customer, StatusGroup, Status1, MIN(Date) As StartDate
    FROM cteDifference
    GROUP BY Customer, StatusGroup, Status1
  ),
  cteNextDate AS
  (
    SELECT Customer, StatusGroup, Status1, StartDate,
        LEAD(StartDate, 1, NULL) OVER (PARTITION BY Customer  ORDER BY StatusGroup) AS EndDate
    FROM cteGroup
  )
SELECT Customer, StartDate, EndDate
FROM cteNextDate
WHERE Status1 = 1
ORDER BY Customer, StateDate

The key trick here is the second CTE which uses the difference of two ROW_NUMBER() functions to tag the customer records (with the StatusGroup column) into separate partitions by contiguous runs of records whose status is 1 or not 1. After that they can be grouped according to that tag to get the start dates, and then use the LEAD() function to get the following group's StartDate as the current groupings EndDate.

(There may be a more compact way to express this, but I like to layout each stage as a separate CTE.)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文