当前位置：文江博客话题详情

在 T-SQL 中查找开始和结束日期（基于设置）

发布于 2024-10-10 20:29:46 字数 1036 浏览 5 评论 0 原文

我有以下内容。

Name    Date
A   2011-01-01 01:00:00.000
A   2011-02-01 02:00:00.000
A   2011-03-01 03:00:00.000
B   2011-04-01 04:00:00.000
A   2011-05-01 07:00:00.000

所需的输出是

Name       StartDate                        EndDate
-------------------------------------------------------------------
A          2011-01-01 01:00:00.000         2011-04-01 04:00:00.000    
B          2011-04-01 04:00:00.000         2011-05-01 07:00:00.000    
A          2011-05-01 07:00:00.000         NULL

如何在基于集合的方法中使用 TSQL 实现相同的目标。

DDL 如下

DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME) 
INSERT INTO @t VALUES('A', '2011-01-01 01:00:00') 
INSERT INTO @t VALUES('A', '2011-01-02 02:00:00') 
INSERT INTO @t VALUES('A', '2011-01-03 03:00:00') 
INSERT INTO @t VALUES('B', '2011-01-04 04:00:00') 
INSERT INTO @t VALUES('A', '2011-01-05 07:00:00')

Select * from @t

原文

I have the below.

Name    Date
A   2011-01-01 01:00:00.000
A   2011-02-01 02:00:00.000
A   2011-03-01 03:00:00.000
B   2011-04-01 04:00:00.000
A   2011-05-01 07:00:00.000

The desired output is

Name       StartDate                        EndDate
-------------------------------------------------------------------
A          2011-01-01 01:00:00.000         2011-04-01 04:00:00.000    
B          2011-04-01 04:00:00.000         2011-05-01 07:00:00.000    
A          2011-05-01 07:00:00.000         NULL

How to achieve the same using TSQL in a set based approach.

DDL is as under

DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME) 
INSERT INTO @t VALUES('A', '2011-01-01 01:00:00') 
INSERT INTO @t VALUES('A', '2011-01-02 02:00:00') 
INSERT INTO @t VALUES('A', '2011-01-03 03:00:00') 
INSERT INTO @t VALUES('B', '2011-01-04 04:00:00') 
INSERT INTO @t VALUES('A', '2011-01-05 07:00:00')

Select * from @t

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

南城旧梦 2024-10-17 20:29:46

;WITH cte1
     AS (SELECT *,
                ROW_NUMBER() OVER (ORDER BY Date) -
                ROW_NUMBER() OVER (PARTITION BY PersonName
                ORDER BY Date) AS G
         FROM   @t),
     cte2
     AS (SELECT PersonName,
                MIN([Date]) StartDate,
                ROW_NUMBER() OVER (ORDER BY  MIN([Date])) AS rn
         FROM   cte1
         GROUP  BY PersonName,
                   G)
SELECT a.PersonName,
       a.StartDate,
       b.StartDate AS EndDate
FROM   cte2 a
       LEFT JOIN cte2 b
         ON a.rn + 1 = b.rn

因为 CTE 的结果通常不会具体化
如果你实现了以下目标，你可能会发现你会获得更好的表现
自己的中间结果如下。

DECLARE @t2 TABLE (
  rn         INT IDENTITY(1, 1) PRIMARY KEY,
  PersonName VARCHAR(32),
  StartDate  DATETIME );

INSERT INTO @t2
SELECT PersonName,
       MIN([Date]) StartDate
FROM   (SELECT *,
               ROW_NUMBER() OVER (ORDER BY Date) -
               ROW_NUMBER() OVER (PARTITION BY PersonName
               ORDER BY Date) AS G
        FROM   @t) t
GROUP  BY PersonName,
          G
ORDER  BY StartDate

SELECT a.PersonName,
       a.StartDate,
       b.StartDate AS EndDate
FROM   @t2 a
       LEFT JOIN @t2 b
         ON a.rn + 1 = b.rn

;WITH cte1
     AS (SELECT *,
                ROW_NUMBER() OVER (ORDER BY Date) -
                ROW_NUMBER() OVER (PARTITION BY PersonName
                ORDER BY Date) AS G
         FROM   @t),
     cte2
     AS (SELECT PersonName,
                MIN([Date]) StartDate,
                ROW_NUMBER() OVER (ORDER BY  MIN([Date])) AS rn
         FROM   cte1
         GROUP  BY PersonName,
                   G)
SELECT a.PersonName,
       a.StartDate,
       b.StartDate AS EndDate
FROM   cte2 a
       LEFT JOIN cte2 b
         ON a.rn + 1 = b.rn

Because the result of CTEs are not generally materialised however
you may well find you get better performance if you materialize the
intermediate result yourself as below.

DECLARE @t2 TABLE (
  rn         INT IDENTITY(1, 1) PRIMARY KEY,
  PersonName VARCHAR(32),
  StartDate  DATETIME );

INSERT INTO @t2
SELECT PersonName,
       MIN([Date]) StartDate
FROM   (SELECT *,
               ROW_NUMBER() OVER (ORDER BY Date) -
               ROW_NUMBER() OVER (PARTITION BY PersonName
               ORDER BY Date) AS G
        FROM   @t) t
GROUP  BY PersonName,
          G
ORDER  BY StartDate

SELECT a.PersonName,
       a.StartDate,
       b.StartDate AS EndDate
FROM   @t2 a
       LEFT JOIN @t2 b
         ON a.rn + 1 = b.rn

回复收藏 0 原文

三生殊途 2024-10-17 20:29:46

SELECT
  PersonName,
  StartDate = MIN(Date),
  EndDate
FROM (
  SELECT
    PersonName,
    Date,
    EndDate = (
      /* get the earliest date after current date
         associated with a different person */
      SELECT MIN(t1.Date)
      FROM @t AS t1
      WHERE t1.Date > t.Date
        AND t1.PersonName <> t.PersonName
    )
  FROM @t AS t
) s
GROUP BY PersonName, EndDate
ORDER BY 2

基本上，对于每个 Date，我们都会找到其后最近的日期，以便与不同的 PersonName 关联。这给了我们 EndDate，它现在可以区分同一个人的连续日期组。

现在我们只需要按 PersonName & 对数据进行分组即可。 EndDate 并获取每个组中最小的Date 作为StartDate。是的，当然可以按 StartDate 对数据进行排序。

SELECT
  PersonName,
  StartDate = MIN(Date),
  EndDate
FROM (
  SELECT
    PersonName,
    Date,
    EndDate = (
      /* get the earliest date after current date
         associated with a different person */
      SELECT MIN(t1.Date)
      FROM @t AS t1
      WHERE t1.Date > t.Date
        AND t1.PersonName <> t.PersonName
    )
  FROM @t AS t
) s
GROUP BY PersonName, EndDate
ORDER BY 2

Basically, for every Date we find the nearest date after it such that is associated with a different PersonName. That gives us EndDate, which now distinguishes for us consecutive groups of dates for the same person.

Now we only need to group the data by PersonName & EndDate and get the minimal Date in every group as StartDate. And yes, sort the data by StartDate, of course.

回复收藏 0 原文

你与昨日 2024-10-17 20:29:46

获取行号，以便您知道上一条记录在哪里。然后，记录一条记录及其后的下一条记录。当状态改变时，我们有一个候选行。

select 
  state, 
  min(start_timestamp),
  max(end_timestamp)

from
(
    select
        first.state, 
        first.timestamp_ as start_timestamp,
        second.timestamp_ as end_timestamp

        from
        (
            select
                *, row_number() over (order by timestamp_) as id
            from test
        ) as first

        left outer join
        (
            select
                *, row_number() over (order by timestamp_) as id
            from test
        ) as second
        on 
            first.id = second.id - 1 
            and first.state != second.state
) as agg
group by state
    having max(end_timestamp) is not null 

union

-- last row wont have a ending row
--(select state, timestamp_, null from test order by timestamp_ desc limit 1)
    -- I think it something like this for sql server
     (select top state, timestamp_, null from test order by timestamp_ desc)

order by 2
;

使用 PostgreSQL 进行了测试，但也应该适用于 SQL Server

Get a row number so you will know where the previous record is. Then, take a record and the next record after it. When the state changes we have a candidate row.

select 
  state, 
  min(start_timestamp),
  max(end_timestamp)

from
(
    select
        first.state, 
        first.timestamp_ as start_timestamp,
        second.timestamp_ as end_timestamp

        from
        (
            select
                *, row_number() over (order by timestamp_) as id
            from test
        ) as first

        left outer join
        (
            select
                *, row_number() over (order by timestamp_) as id
            from test
        ) as second
        on 
            first.id = second.id - 1 
            and first.state != second.state
) as agg
group by state
    having max(end_timestamp) is not null 

union

-- last row wont have a ending row
--(select state, timestamp_, null from test order by timestamp_ desc limit 1)
    -- I think it something like this for sql server
     (select top state, timestamp_, null from test order by timestamp_ desc)

order by 2
;

Tested with PostgreSQL but should work with SQL Server as well

回复收藏 0 原文

小兔几 2024-10-17 20:29:46

cte 的另一个答案是一个很好的答案。另一种选择是在任何情况下迭代集合。它不是基于设置的，但它是另一种方法。

您将需要迭代到 A. 为与其事务相对应的每条记录分配一个唯一的 id，或者 B. 以实际获取输出。

TSQL 并不适合迭代记录，尤其是当您有很多记录时，因此我会推荐其他一些方法，例如小型 .net 程序或更擅长迭代的程序。

回复收藏 0 原文

牛↙奶布丁 2024-10-17 20:29:46

有一种非常快速的方法可以使用一些间隙和岛屿理论来做到这一点：

WITH CTE as (SELECT PersonName, [Date]
                   , Row_Number() over (ORDER BY [Date])
                     - Row_Number() over (ORDER BY PersonName, [Date]) as Island
             FROM @t)

Select PersonName, Min([Date]), Max([Date])
from CTE
GROUP BY Island, PersonName
ORDER BY Min([Date])

There's a very quick way to do this using a bit of Gaps and Islands theory:

WITH CTE as (SELECT PersonName, [Date]
                   , Row_Number() over (ORDER BY [Date])
                     - Row_Number() over (ORDER BY PersonName, [Date]) as Island
             FROM @t)

Select PersonName, Min([Date]), Max([Date])
from CTE
GROUP BY Island, PersonName
ORDER BY Min([Date])

回复收藏 0 原文

~没有更多了~