Overlapping time in SQL Server

Posted on 2024-08-17 13:03:38

I have a table like this:

ColumnId  Intime                  Outtime
1         01/02/2009 10.00.000    01/02/2009 20.00.0000
2         01/02/2009 2.00.000     01/02/2009 2.00.0000
3         01/02/2009 2.00.000     01/02/2009 5.00.0000
4         01/02/2009 3.3.0.000    01/02/2009 5.00.0000
5         01/02/2009 10.00.000    01/02/2009 22.00.0000
6         01/02/2009 3.00.000     01/02/2009 4.00.0000

These are my columns and values. I would like to find the overlapping records, and how many records overlap on a particular date. The overlap is on time of day, hours 1-24 within a single day.

Note: my table has millions of records.

For example, the first record logs in at 10 and logs out at 20, and the 5th record logs in at 10 and logs out at 22, so the 5th record overlaps the first. No indexes are available on the table.

Please help me with this query.

I need the query to run on SQL Server 2005.

Comments (5)

无声无音无过去 2024-08-24 13:03:39

Off the top of my head, and assuming an index on both columns, you could use something like this:

SELECT a.ColumnId
      ,a.InTime
      ,a.OutTime
      ,b.ColumnId AS OverlappingId
      ,b.InTime   AS OverlappingInTime
      ,b.OutTime  AS OverlappingOutTime
  FROM TimeTable  AS a
  JOIN TimeTable  AS b ON ((a.InTime  BETWEEN b.InTime AND b.OutTime)
                       OR  (a.OutTime BETWEEN b.InTime AND b.OutTime) 
                       OR  (a.InTime < b.InTime AND a.OutTime > b.OutTime))
                      AND  (a.ColumnId != b.ColumnId)

But I'm really not sure how this query would perform on a table with millions of records, as you mention.

Edited to add, and edited yet again:

After Vadim K.'s comments, I noticed that the query I had written previously missed the case of total overlap, that is, when one range entirely covers the other. Above is my revised query, and below is the original one:

SELECT a.ColumnId 
      ,a.InTime 
      ,a.OutTime 
      ,b.ColumnId AS OverlappingId 
      ,b.InTime   AS OverlappingInTime 
      ,b.OutTime  AS OverlappingOutTime 
  FROM TimeTable  AS a 
  JOIN TimeTable  AS b ON ((a.InTime  BETWEEN b.InTime AND b.OutTime) 
                       OR  (a.OutTime BETWEEN b.InTime AND b.OutTime)) 
                      AND  (a.ColumnId != b.ColumnId) 

Using the question's initial data for a test run:

+--------+------------------+------------------+
|ColumnId| InTime           | OutTime          |
+--------+------------------+------------------+
|      1 | 01/02/2009 10:00 | 01/02/2009 20:00 |   
|      2 | 01/02/2009  2:00 | 01/02/2009  2:00 |  
|      3 | 01/02/2009  2:00 | 01/02/2009  5:00 |  
|      4 | 01/02/2009  3:03 | 01/02/2009  5:00 |  
|      5 | 01/02/2009 10:00 | 01/02/2009 22:00 |  
|      6 | 01/02/2009  3:00 | 01/02/2009  4:00 |  
+--------+------------------+------------------+
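To reproduce the test run, the sample data could be loaded with a script along these lines. This is only a sketch under assumptions that are not stated in the question: the table is named TimeTable (as in the queries here), both time columns are DATETIME, 01/02/2009 is read as 2 January 2009, and the 3.3.0.000 value is read as 03:03. No index is created, matching the question.

```sql
-- Hypothetical setup for the sample data (SQL Server 2005 syntax).
-- Assumed: table name TimeTable, DATETIME columns, date read as 2 January 2009.
CREATE TABLE TimeTable (
    ColumnId INT      NOT NULL,
    InTime   DATETIME NOT NULL,
    OutTime  DATETIME NOT NULL
);

-- SQL Server 2005 has no multi-row VALUES clause, so the rows are
-- supplied with SELECT ... UNION ALL.
INSERT INTO TimeTable (ColumnId, InTime, OutTime)
SELECT 1, '2009-01-02T10:00:00', '2009-01-02T20:00:00' UNION ALL
SELECT 2, '2009-01-02T02:00:00', '2009-01-02T02:00:00' UNION ALL
SELECT 3, '2009-01-02T02:00:00', '2009-01-02T05:00:00' UNION ALL
SELECT 4, '2009-01-02T03:03:00', '2009-01-02T05:00:00' UNION ALL
SELECT 5, '2009-01-02T10:00:00', '2009-01-02T22:00:00' UNION ALL
SELECT 6, '2009-01-02T03:00:00', '2009-01-02T04:00:00';
```

The ISO 8601 literal form (yyyy-mm-ddThh:mm:ss) is used because SQL Server interprets it the same way regardless of the session's language or date-format settings.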

Running the original query we have the following result:

+--------+------------------+------------------+-------------+
|ColumnId| InTime           | OutTime          |OverlappingId|
+--------+------------------+------------------+-------------+
|      1 | 01/02/2009 10:00 | 01/02/2009 20:00 |           5 |
|      2 | 01/02/2009  2:00 | 01/02/2009  2:00 |           3 |
|      3 | 01/02/2009  2:00 | 01/02/2009  5:00 |           2 |
|      3 | 01/02/2009  2:00 | 01/02/2009  5:00 |           4 |
|      4 | 01/02/2009  3:03 | 01/02/2009  5:00 |           3 |
|      4 | 01/02/2009  3:03 | 01/02/2009  5:00 |           6 |
|      5 | 01/02/2009 10:00 | 01/02/2009 22:00 |           1 |
|      6 | 01/02/2009  3:00 | 01/02/2009  4:00 |           3 |
|      6 | 01/02/2009  3:00 | 01/02/2009  4:00 |           4 |
+--------+------------------+------------------+-------------+

Running the updated query we have the following result:

+--------+------------------+------------------+-------------+
|ColumnId| InTime           | OutTime          |OverlappingId|
+--------+------------------+------------------+-------------+
|      1 | 01/02/2009 10:00 | 01/02/2009 20:00 |           5 |
|      2 | 01/02/2009  2:00 | 01/02/2009  2:00 |           3 |
|      3 | 01/02/2009  2:00 | 01/02/2009  5:00 |           2 |
|      3 | 01/02/2009  2:00 | 01/02/2009  5:00 |           4 |
|      3 | 01/02/2009  2:00 | 01/02/2009  5:00 |           6 | << row missed by the original query
|      4 | 01/02/2009  3:03 | 01/02/2009  5:00 |           3 |
|      4 | 01/02/2009  3:03 | 01/02/2009  5:00 |           6 |
|      5 | 01/02/2009 10:00 | 01/02/2009 22:00 |           1 |
|      6 | 01/02/2009  3:00 | 01/02/2009  4:00 |           3 |
|      6 | 01/02/2009  3:00 | 01/02/2009  4:00 |           4 |
+--------+------------------+------------------+-------------+

Yes, some IDs are repeated, but that's because they overlap with different records.

The question also asks for the number of overlapping rows. The question isn't clear enough for me to be sure whether it wants the number of overlapping rows relative to the original table.

Some people have suggested using a.ColumnId < b.ColumnId or a.ColumnId > b.ColumnId to avoid repetition. However, that still doesn't give the count, because with the first comparison we'd get the following result:

+--------+------------------+------------------+-------------+
|ColumnId| InTime           | OutTime          |OverlappingId|
+--------+------------------+------------------+-------------+
|      1 | 01/02/2009 10:00 | 01/02/2009 20:00 |           5 |
|      2 | 01/02/2009  2:00 | 01/02/2009  2:00 |           3 |
|      3 | 01/02/2009  2:00 | 01/02/2009  5:00 |           4 |
|      3 | 01/02/2009  2:00 | 01/02/2009  5:00 |           6 | 
|      4 | 01/02/2009  3:03 | 01/02/2009  5:00 |           6 |
+--------+------------------+------------------+-------------+

Notice that all 6 rows of the sample data are referenced in the results, although the result has only 5 lines. I believe that with this data, where every row overlaps some other row at one point or another, the number of overlapping rows is 6.

To get this result, the query below could be used:

SELECT COUNT (DISTINCT a.ColumnId)
  FROM TimeTable  AS a
  JOIN TimeTable  AS b ON ((a.InTime  BETWEEN b.InTime AND b.OutTime)
                       OR  (a.OutTime BETWEEN b.InTime AND b.OutTime) 
                       OR  (a.InTime < b.InTime AND a.OutTime > b.OutTime))
                      AND  (a.ColumnId != b.ColumnId)

This returns 6, the count of all six rows.
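Since the question asks for the count per date, the same idea might be extended with a GROUP BY. The sketch below makes two assumptions of its own: each row is attributed to the calendar date of its InTime, and InTime <= OutTime holds in every row. Under that second assumption the three OR branches collapse into the two comparisons used here.

```sql
-- Sketch: number of distinct rows involved in at least one overlap, per date.
-- DATEADD/DATEDIFF strips the time part, since SQL Server 2005 has no DATE type.
SELECT DATEADD(day, DATEDIFF(day, 0, a.InTime), 0) AS OverlapDate
      ,COUNT(DISTINCT a.ColumnId)                  AS OverlappingRows
  FROM TimeTable AS a
  JOIN TimeTable AS b ON  a.InTime  <= b.OutTime   -- a starts before b ends
                      AND a.OutTime >= b.InTime    -- a ends after b starts
                      AND a.ColumnId != b.ColumnId
 GROUP BY DATEADD(day, DATEDIFF(day, 0, a.InTime), 0)
```

With the six sample rows, which all fall on the same date, this should produce a single row with a count of 6.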

流绪微梦 2024-08-24 13:03:39

Test the solutions carefully. I've found that the answers posted so far either get the overlap check wrong or return too many results (two rows for each overlap).

select
    aa.ColumnId as ColumnIdA, aa.InTime as InTimeA, aa.OutTime as OutTimeA,
    bb.ColumnId as ColumnIdB, bb.InTime as InTimeB, bb.OutTime as OutTimeB
from
    MyTable aa
    join
    MyTable bb on aa.ColumnId < bb.ColumnId
where
    aa.InTime < bb.OutTime
    and
    aa.OutTime > bb.InTime

One must be careful in defining "overlap". I assume that if the first period is 3am to 4am and the second period is 4am to 5am, these ranges don't overlap. If you do want this case to count as an overlap, change < to <= and > to >= in the where clause.
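The boundary case can be checked in isolation with a small script like the following. It is only an illustration; the variable names are made up for the example, and the date is arbitrary.

```sql
-- Two adjacent periods: 3am-4am and 4am-5am (hypothetical values).
DECLARE @aIn DATETIME, @aOut DATETIME, @bIn DATETIME, @bOut DATETIME;
SET @aIn  = '2009-01-02T03:00:00';
SET @aOut = '2009-01-02T04:00:00';
SET @bIn  = '2009-01-02T04:00:00';
SET @bOut = '2009-01-02T05:00:00';

SELECT
    -- Strict comparison: periods that merely touch are not an overlap.
    CASE WHEN @aIn <  @bOut AND @aOut >  @bIn
         THEN 'overlap' ELSE 'no overlap' END AS StrictCheck,
    -- Inclusive comparison: touching end points count as an overlap.
    CASE WHEN @aIn <= @bOut AND @aOut >= @bIn
         THEN 'overlap' ELSE 'no overlap' END AS InclusiveCheck;
```

Here StrictCheck comes out as 'no overlap' and InclusiveCheck as 'overlap', because the only shared instant is 4am.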

Performance is proportional to the square of the number of rows. Faster solutions are possible for large data sets, but are much more involved than this one.
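One simple way to keep the quadratic cost manageable, short of the more involved solutions, might be to compare only the rows of one date at a time, since the question asks about a particular date. The sketch below assumes that no period spans midnight, so both rows of any overlapping pair start on the same date; @DayStart and @DayEnd are hypothetical variables introduced for the example.

```sql
-- Restrict both sides of the self-join to a single day (assumed: no period
-- crosses midnight), so the pairwise comparison runs over one day's rows only.
DECLARE @DayStart DATETIME, @DayEnd DATETIME;
SET @DayStart = '2009-01-02T00:00:00';
SET @DayEnd   = DATEADD(day, 1, @DayStart);

select
    aa.ColumnId as ColumnIdA, aa.InTime as InTimeA, aa.OutTime as OutTimeA,
    bb.ColumnId as ColumnIdB, bb.InTime as InTimeB, bb.OutTime as OutTimeB
from
    MyTable aa
    join
    MyTable bb on aa.ColumnId < bb.ColumnId
where
    aa.InTime >= @DayStart and aa.InTime < @DayEnd
    and
    bb.InTime >= @DayStart and bb.InTime < @DayEnd
    and
    aa.InTime < bb.OutTime
    and
    aa.OutTime > bb.InTime
```

The cost is still quadratic, but in the number of rows for that day rather than in the size of the whole table. Without an index on InTime, each run still scans the table to find the day's rows.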

你与昨日 2024-08-24 13:03:39

SELECT  T1.ColumnId, T1.Intime, T1.OutTime
-- Self-join: T1 and T2 alias the same table (called TimeTable here, as in the other answers)
FROM    TimeTable AS T1, TimeTable AS T2
WHERE   T1.ColumnId <> T2.ColumnId
AND (    T2.Intime  BETWEEN T1.Intime AND T1.OutTime
      OR T2.OutTime BETWEEN T1.Intime AND T1.OutTime
      -- T2 completely covers T1
      OR (T2.Intime < T1.Intime AND T2.OutTime > T1.OutTime) )

这个俗人 2024-08-24 13:03:39

Select T1.*, T2.*
From Table1 T1
-- Two rows overlap when one row's InTime falls inside the other row's interval
Inner Join Table1 T2 ON ((T1.InTime >= T2.InTime AND T1.InTime < T2.OutTime)
                         OR (T2.InTime >= T1.InTime AND T2.InTime < T1.OutTime))
                        AND (T1.ColumnId != T2.ColumnId)

地狱即天堂 2024-08-24 13:03:39

You can use the OVERLAPS predicate if your SQL dialect is ANSI SQL:2003 compliant.
Note the t1.ColumnId < t2.ColumnId condition, which avoids duplicates.

SELECT *
FROM TimeTable AS t1,TimeTable AS t2
WHERE (t1.Intime,t1.Outtime) OVERLAPS (t2.Intime,t2.Outtime)
AND t1.ColumnId < t2.ColumnId
ORDER BY 1;