最大计数范围交集(在 T-SQL 中)

发布于 2024-08-05 08:23:26 字数 355 浏览 3 评论 0原文

假设我有一个包含一堆日期的表,例如:

declare @tbl table {
    idx int primary key,
    startdate datetime,
    enddate datetime
}

我想找到开始日期和结束日期相交的最大行集(在现实世界中,开始日期和结束日期代表事件的开始和结束时间,并且我需要找到同时发生的最大事件数)。

在另一种编程语言中,我可能会按开始日期对所有条目进行排序,然后迭代每个条目一次,构建临时的交集集(跟踪生成的最大集)。但我不确定这是否是在 T-SQL 中表达这一点的最有效方法。帮助!

哦,它是 SQL Server 2000。:(

Let's say I have a table with a bunch of dates, e.g.:

declare @tbl table {
    idx int primary key,
    startdate datetime,
    enddate datetime
}

And I want to find the largest set of rows where startdate and enddate intersect (in the real world, the start date and end date represents start and end times for events, and I need to find the maximum # of events occurring simultaneously).

In another programming language I might sort all entries by startdate, then iterate through each entry once, building a temporary set of intersections (keeping track of the largest set generated). But I'm not sure if this is the most efficient way to express this in T-SQL. help!

Oh, and it's SQL Server 2000. :(

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

煮酒 2024-08-12 08:23:26

更新以删除联合所有

declare @tbl table (
idx int identity(1,1) primary key,    
startdate datetime,    
enddate datetime);

insert into @tbl (startdate, enddate) 
select '2009-01-01', '2009-01-05'
union all select '2009-01-02', '2009-01-04'
union all select '2009-01-01', '2009-01-03'
union all select '2009-01-03', '2009-01-06'
union all select '2009-01-04', '2009-01-07'
union all select '2009-01-05', '2009-01-08'

select idx, startdate
   , (select sum(in_or_out) 
from (
   select case when startdate<=all_events.startdate then 1 else 0 end
     + case when enddate <= all_events.startdate then -1 else 0 end as in_or_out
   from @tbl 
   where startdate <= all_events.startdate
     or enddate <= all_events.startdate) as previous
) as concurent
from @tbl all_events
order by startdate

这给出了开始会话的时间线,以及新会话开始时的并发会话计数:

idx startdate   concurent
3   2009-01-01 00:00:00.000 2
1   2009-01-01 00:00:00.000 2
2   2009-01-02 00:00:00.000 3
4   2009-01-03 00:00:00.000 3
5   2009-01-04 00:00:00.000 3
6   2009-01-05 00:00:00.000 3

要获取原始请求(具有最大并发性的并发会话集),您需要要运行此查询两次,一次获取最大并发会话数,一次获取具有最大并发次数的会话的开始日期,那么您必须获取这些会话。

已更新

好的,这里是检索最大并发会话的单个查询。我更改了测试数据以删除结束和开始的 ambibuos 重叠:

declare @tbl table (
idx int identity(1,1) primary key,    
startdate datetime,    
enddate datetime);

insert into @tbl (startdate, enddate) 
select '2009-01-01', '2009-01-04 23:59:59'
union all select '2009-01-02', '2009-01-03 23:59:59'
union all select '2009-01-01', '2009-01-02 23:59:59'
union all select '2009-01-03', '2009-01-03 23:59:59'
union all select '2009-01-04', '2009-01-04 23:59:59'
union all select '2009-01-05', '2009-01-05 23:59:59'


select max_concurent_starts.startdate as concurentdate
  , session.*
from (
  select *
  ,(
        select sum(in_or_out) 
        from (
            select case when startdate<=all_events.startdate then 1 else 0 end
                + case when enddate <= all_events.startdate then -1 else 0 end 
                as in_or_out
          from @tbl 
          where startdate <= all_events.startdate
              or enddate <= all_events.startdate) as previous
    ) as concurent
  from @tbl all_events) as max_concurent_starts
  join @tbl as session 
     on session.startdate <= max_concurent_starts.startdate 
     and session.enddate >= max_concurent_starts.startdate
  where concurent = (
  select top 1 concurent
  from (
      select (
          select sum(in_or_out) 
          from (
              select case when startdate<=all_events.startdate then 1 else 0 end
                  + case when enddate <= all_events.startdate then -1 else 0 end 
                  as in_or_out
            from @tbl 
            where startdate <= all_events.startdate
                or enddate <= all_events.startdate) as previous
      ) as concurent
    from @tbl all_events) as all_events_with_concurent
    order by concurent desc)
  order by concurentdate, startdate;

这给出了如下结果:

concurentdate   idx startdate   enddate
2009-01-02 00:00:00.000 3   2009-01-01 00:00:00.000 2009-01-02 23:59:59.000
2009-01-02 00:00:00.000 1   2009-01-01 00:00:00.000 2009-01-04 23:59:59.000
2009-01-02 00:00:00.000 2   2009-01-02 00:00:00.000 2009-01-03 23:59:59.000
2009-01-03 00:00:00.000 1   2009-01-01 00:00:00.000 2009-01-04 23:59:59.000
2009-01-03 00:00:00.000 2   2009-01-02 00:00:00.000 2009-01-03 23:59:59.000
2009-01-03 00:00:00.000 4   2009-01-03 00:00:00.000 2009-01-03 23:59:59.000

其内容如下:在 2009-01-02 00:00:00 上有 3 个并发会话 (3 、 1 和 2) 以及它们各自的开始和结束。有一个平局,在 2009-01-03 00:00:00 上还有 3 个并发会话(1、2 和 4),其各自的开始和结束。

性能里程可能会有所不同。使用 CTE 在 SQL 2005 中可以将查询编写简单一百万倍。

Updated to remove the union all

declare @tbl table (
idx int identity(1,1) primary key,    
startdate datetime,    
enddate datetime);

insert into @tbl (startdate, enddate) 
select '2009-01-01', '2009-01-05'
union all select '2009-01-02', '2009-01-04'
union all select '2009-01-01', '2009-01-03'
union all select '2009-01-03', '2009-01-06'
union all select '2009-01-04', '2009-01-07'
union all select '2009-01-05', '2009-01-08'

select idx, startdate
   , (select sum(in_or_out) 
from (
   select case when startdate<=all_events.startdate then 1 else 0 end
     + case when enddate <= all_events.startdate then -1 else 0 end as in_or_out
   from @tbl 
   where startdate <= all_events.startdate
     or enddate <= all_events.startdate) as previous
) as concurent
from @tbl all_events
order by startdate

This gives the timeline of start session, with the count of concurent sessions at the moment new session starts:

idx startdate   concurent
3   2009-01-01 00:00:00.000 2
1   2009-01-01 00:00:00.000 2
2   2009-01-02 00:00:00.000 3
4   2009-01-03 00:00:00.000 3
5   2009-01-04 00:00:00.000 3
6   2009-01-05 00:00:00.000 3

To get the original request (set of concurent sessions with max concurency) you need to run this query twice, once to get the max concurent sessions and once to get the start dates of the sessions that have max concurent times, then you must get those sessions.

Updated

OK, so here the one single query that retrieves the max concurent sessions. I changed the test data to remove ambibuos overlaps of end and start:

declare @tbl table (
idx int identity(1,1) primary key,    
startdate datetime,    
enddate datetime);

insert into @tbl (startdate, enddate) 
select '2009-01-01', '2009-01-04 23:59:59'
union all select '2009-01-02', '2009-01-03 23:59:59'
union all select '2009-01-01', '2009-01-02 23:59:59'
union all select '2009-01-03', '2009-01-03 23:59:59'
union all select '2009-01-04', '2009-01-04 23:59:59'
union all select '2009-01-05', '2009-01-05 23:59:59'


select max_concurent_starts.startdate as concurentdate
  , session.*
from (
  select *
  ,(
        select sum(in_or_out) 
        from (
            select case when startdate<=all_events.startdate then 1 else 0 end
                + case when enddate <= all_events.startdate then -1 else 0 end 
                as in_or_out
          from @tbl 
          where startdate <= all_events.startdate
              or enddate <= all_events.startdate) as previous
    ) as concurent
  from @tbl all_events) as max_concurent_starts
  join @tbl as session 
     on session.startdate <= max_concurent_starts.startdate 
     and session.enddate >= max_concurent_starts.startdate
  where concurent = (
  select top 1 concurent
  from (
      select (
          select sum(in_or_out) 
          from (
              select case when startdate<=all_events.startdate then 1 else 0 end
                  + case when enddate <= all_events.startdate then -1 else 0 end 
                  as in_or_out
            from @tbl 
            where startdate <= all_events.startdate
                or enddate <= all_events.startdate) as previous
      ) as concurent
    from @tbl all_events) as all_events_with_concurent
    order by concurent desc)
  order by concurentdate, startdate;

This gives a result like:

concurentdate   idx startdate   enddate
2009-01-02 00:00:00.000 3   2009-01-01 00:00:00.000 2009-01-02 23:59:59.000
2009-01-02 00:00:00.000 1   2009-01-01 00:00:00.000 2009-01-04 23:59:59.000
2009-01-02 00:00:00.000 2   2009-01-02 00:00:00.000 2009-01-03 23:59:59.000
2009-01-03 00:00:00.000 1   2009-01-01 00:00:00.000 2009-01-04 23:59:59.000
2009-01-03 00:00:00.000 2   2009-01-02 00:00:00.000 2009-01-03 23:59:59.000
2009-01-03 00:00:00.000 4   2009-01-03 00:00:00.000 2009-01-03 23:59:59.000

which reads as follows: on 2009-01-02 00:00:00 there were 3 concurent sessions (3, 1 and 2) with they respective starts and ends. There is a tie, on 2009-01-03 00:00:00 there were also 3 concurent sessions (1, 2 and 4) with their respective starts and ends.

Performance milage may vary. The query can be written 1 million times simpler in SQL 2005 using CTEs.

小镇女孩 2024-08-12 08:23:26

试试这个(它接近你想要的我想......

Select Distinct EventId 
From EventTable Et
Join  (Select Top 1 RunDate, Count(*) DateCount
       From 
          (Select Distinct StartDate RunDate
           From EventTable
               Union  
           Select Distinct EndDate RunDate
           From EventTable) A
         Join EventTable E
            On A.RunDate Between E.StartDate And E.EndDate
       Group By RunDate
       Order By Count(*) Desc) Z
   On Z.RunDate Between Et.StartDate and Et.EndDate

哦,如果你的日期中有日期和时间,那么仅用实际日期部分替换此处的所有日期(去掉时间)

Select Distinct EventId 
From EventTable Et
Join  (Select Top 1 RunDate, Count(*) DateCount
       From 
          (Select Distinct DateAdd(day, 0, DateDiff(day, 0, StartDate)) RunDate
           From EventTable
               Union  
           Select Distinct DateAdd(day, 0, DateDiff(day, -1, EndDate)) RunDate
           From EventTable) A
         Join EventTable E
            On A.RunDate Between DateAdd(day, 0, DateDiff(day, 0, E.StartDate))
                             and DateAdd(day, 0, DateDiff(day, -1, E.EndDate))
       Group By RunDate
       Order By Count(*) Desc) Z
   On Z.RunDate Between DateAdd(day, 0, DateDiff(day, 0, Et.StartDate))
                    and DateAdd(day, 0, DateDiff(day, -1, Et.EndDate))

try this (it's close to what you want I think...

Select Distinct EventId 
From EventTable Et
Join  (Select Top 1 RunDate, Count(*) DateCount
       From 
          (Select Distinct StartDate RunDate
           From EventTable
               Union  
           Select Distinct EndDate RunDate
           From EventTable) A
         Join EventTable E
            On A.RunDate Between E.StartDate And E.EndDate
       Group By RunDate
       Order By Count(*) Desc) Z
   On Z.RunDate Between Et.StartDate and Et.EndDate

oh, If your dates have date and Time in them, then replace all the dates herein with actual date portion only (strip off the time)

Select Distinct EventId 
From EventTable Et
Join  (Select Top 1 RunDate, Count(*) DateCount
       From 
          (Select Distinct DateAdd(day, 0, DateDiff(day, 0, StartDate)) RunDate
           From EventTable
               Union  
           Select Distinct DateAdd(day, 0, DateDiff(day, -1, EndDate)) RunDate
           From EventTable) A
         Join EventTable E
            On A.RunDate Between DateAdd(day, 0, DateDiff(day, 0, E.StartDate))
                             and DateAdd(day, 0, DateDiff(day, -1, E.EndDate))
       Group By RunDate
       Order By Count(*) Desc) Z
   On Z.RunDate Between DateAdd(day, 0, DateDiff(day, 0, Et.StartDate))
                    and DateAdd(day, 0, DateDiff(day, -1, Et.EndDate))
知你几分 2024-08-12 08:23:26

另一种方法:

DECLARE @idx INT,
        @startdate DATETIME,
    @enddate DATETIME,  
        @prev_enddate DATETIME,
        @counter INT,
    @counter_max INT

DECLARE db_cursor CURSOR FOR  
SELECT idx, startdate,enddate 
FROM @tbl
ORDER BY startdate,enddate

OPEN db_cursor   

FETCH NEXT FROM db_cursor INTO @idx, @startdate, @enddate
SET @prev_enddate = @enddate
SET @counter = 0
SET @counter_max = 0

WHILE @@FETCH_STATUS = 0   
BEGIN   
IF @startdate < @prev_enddate
BEGIN
    SET @counter = @counter + 1 
    IF @counter > @counter_max
    BEGIN
        SET @counter_max = @counter
    END
END
ELSE
BEGIN
    SET @counter = 1
END

SET @prev_enddate = @enddate
FETCH NEXT FROM db_cursor INTO @idx, @startdate, @enddate           
END   

CLOSE db_cursor   
DEALLOCATE db_cursor

SELECT @counter_max

Another approach:

DECLARE @idx INT,
        @startdate DATETIME,
    @enddate DATETIME,  
        @prev_enddate DATETIME,
        @counter INT,
    @counter_max INT

DECLARE db_cursor CURSOR FOR  
SELECT idx, startdate,enddate 
FROM @tbl
ORDER BY startdate,enddate

OPEN db_cursor   

FETCH NEXT FROM db_cursor INTO @idx, @startdate, @enddate
SET @prev_enddate = @enddate
SET @counter = 0
SET @counter_max = 0

WHILE @@FETCH_STATUS = 0   
BEGIN   
IF @startdate < @prev_enddate
BEGIN
    SET @counter = @counter + 1 
    IF @counter > @counter_max
    BEGIN
        SET @counter_max = @counter
    END
END
ELSE
BEGIN
    SET @counter = 1
END

SET @prev_enddate = @enddate
FETCH NEXT FROM db_cursor INTO @idx, @startdate, @enddate           
END   

CLOSE db_cursor   
DEALLOCATE db_cursor

SELECT @counter_max
转身以后 2024-08-12 08:23:26

这个很短,很容易理解并且工作得很好:

CREATE PROCEDURE FindEvents
AS
BEGIN
    DECLARE dates_cursor CURSOR FOR 
        SELECT
            startdate AS thedate, 1 AS change
        FROM
            dates
        UNION
        SELECT
            enddate AS thedate, - 1 AS change
        FROM
            dates
        ORDER BY 
            thedate ASC;

        DECLARE @max INT;
        DECLARE @thedate DATETIME;
        DECLARE @change INT;
        DECLARE @current INT;

        SET @max = 0;
        SET @current = 0;

    OPEN dates_cursor

    FETCH NEXT FROM dates_cursor INTO @thedate, @change

    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @current = @current + @change;
        IF (@current > @max)
        BEGIN
            SET @max = @current;
        END
        FETCH NEXT FROM dates_cursor INTO @thedate, @change
    END

    CLOSE dates_cursor
    DEALLOCATE dates_cursor

    SELECT @max;
END

This one is pretty short, easy to understand and works fine:

CREATE PROCEDURE FindEvents
AS
BEGIN
    DECLARE dates_cursor CURSOR FOR 
        SELECT
            startdate AS thedate, 1 AS change
        FROM
            dates
        UNION
        SELECT
            enddate AS thedate, - 1 AS change
        FROM
            dates
        ORDER BY 
            thedate ASC;

        DECLARE @max INT;
        DECLARE @thedate DATETIME;
        DECLARE @change INT;
        DECLARE @current INT;

        SET @max = 0;
        SET @current = 0;

    OPEN dates_cursor

    FETCH NEXT FROM dates_cursor INTO @thedate, @change

    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @current = @current + @change;
        IF (@current > @max)
        BEGIN
            SET @max = @current;
        END
        FETCH NEXT FROM dates_cursor INTO @thedate, @change
    END

    CLOSE dates_cursor
    DEALLOCATE dates_cursor

    SELECT @max;
END
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文