查询最大并发时间跨度数
我有一个包含两个日期时间字段的 SQL Server 表(CnxStartdatetime
、CnxEnddatetime
)。每行代表一次信息的传输。 我试图根据这两个时间戳找到最大并发传输数。我有一个有效的查询,但它既慢又极其麻烦。我知道一定有更好的方法来解决这个问题,但无法想出任何方法。
对于当前版本,如果我使用 5 个“级别”运行它并获得结果,我必须返回并添加大量 SQL 来测试是否存在 6 个并发传输的实例等。一旦查询获得 7-8 个“级别” “深了它就变得非常慢。
当前版本的片段:
select
t1.id, t2.id, t3.id, t4.id, t5.id, t6.id, t7.id, t8.id, t9.id, t10.id
FROM
dbo.MyTable t1, dbo.MyTable t2, dbo.MyTable t3, dbo.MyTable t4, dbo.MyTable t5,
dbo.MyTable t6, dbo.MyTable t7, dbo.MyTable t8, dbo.MyTable t9, dbo.MyTable t10
WHERE
(((t2.cnxstartdatetime >= t1.cnxstartdatetime) and (t2.cnxstartdatetime <= t1.cnxenddatetime))
or ((t2.cnxenddatetime >= t1.cnxstartdatetime) and (t2.cnxenddatetime <= t1.cnxenddatetime)))
AND
t2.id != t1.id
AND
(((t3.cnxstartdatetime >= t2.cnxstartdatetime) and (t3.cnxstartdatetime >= t1.cnxstartdatetime)and (t3.cnxstartdatetime <= t1.cnxenddatetime) and (t3.cnxstartdatetime <= t2.cnxenddatetime))
or ((t3.cnxenddatetime >= t2.cnxstartdatetime) and (t3.cnxenddatetime >= t1.cnxstartdatetime)and (t3.cnxenddatetime <= t1.cnxenddatetime) and (t3.cnxenddatetime <= t2.cnxenddatetime)))
AND
t3.id != t2.id AND t3.id != t1.id
AND
(((t4.cnxstartdatetime >= t3.cnxstartdatetime) and (t4.cnxstartdatetime >= t1.cnxstartdatetime)and (t4.cnxstartdatetime >= t2.cnxstartdatetime) and (t4.cnxstartdatetime <= t1.cnxenddatetime) and (t4.cnxstartdatetime <= t3.cnxenddatetime)and (t4.cnxstartdatetime <= t2.cnxenddatetime))
or ((t4.cnxenddatetime >= t3.cnxstartdatetime) and (t4.cnxenddatetime >= t1.cnxstartdatetime)and (t4.cnxenddatetime >= t2.cnxstartdatetime) and (t4.cnxenddatetime <= t1.cnxenddatetime)and (t4.cnxenddatetime <= t3.cnxenddatetime)and (t4.cnxenddatetime <= t2.cnxenddatetime)))
AND
t4.id != t3.id AND t4.id != t2.id AND t4.id != t1.id
... *snip*
编辑 许多回复都建议我使用交叉联接
。这没有达到我想要的结果。下面是一个记录“重叠”的交叉联接
结果示例。这是它为我提供的 ID 11787
的列表 如您所见,11781
确实不与 11774
重叠只是时间跨度与 11787
相交的任何记录的列表。
11774 2011-04-29 01:02:56.780 2011-04-29 01:02:58.793
11777 2011-04-29 01:02:56.780 2011-04-29 01:02:58.843
11778 2011-04-29 01:02:56.780 2011-04-29 01:02:58.950
11775 2011-04-29 01:02:56.793 2011-04-29 01:02:58.843
11776 2011-04-29 01:02:56.793 2011-04-29 01:02:58.890
11780 2011-04-29 01:02:58.310 2011-04-29 01:03:02.687
11779 2011-04-29 01:02:58.327 2011-04-29 01:03:02.543
11787 2011-04-29 01:02:58.530 2011-04-29 01:03:08.827 **
11781 2011-04-29 01:02:59.030 2011-04-29 01:03:05.187
11782 2011-04-29 01:02:59.247 2011-04-29 01:03:05.467
11784 2011-04-29 01:02:59.293 2011-04-29 01:03:05.810
11791 2011-04-29 01:03:00.107 2011-04-29 01:03:13.623
11786 2011-04-29 01:03:00.843 2011-04-29 01:03:08.983
11783 2011-04-29 01:03:02.560 2011-04-29 01:03:05.793
11785 2011-04-29 01:03:02.717 2011-04-29 01:03:07.357
11790 2011-04-29 01:03:05.200 2011-04-29 01:03:14.153
11804 2011-04-29 01:03:05.687 2011-04-29 01:03:25.577
11811 2011-04-29 01:03:07.093 2011-04-29 01:03:35.153
11799 2011-04-29 01:03:07.123 2011-04-29 01:03:24.437
11789 2011-04-29 01:03:08.793 2011-04-29 01:03:13.577
我也尝试过使用递归编写 CTE,但我不知道如何确保当前的 ID
与当前并发堆栈中任何先前的 ID
都不匹配。下面的代码会自我递归,直到达到极限。
WITH TransmissionConcurrency (StartTime, EndTime, ConcurrencyLevel) AS
(
SELECT
CnxStartDatetime AS StartTime,
CnxEndDatetime AS EndTime,
1 AS ConcurrencyLevel
FROM dbo.MyTable
UNION ALL
SELECT
CASE WHEN d.CnxStartDatetime > tc.StartTime THEN d.CnxStartDatetime ELSE tc.StartTime END AS StartTime,
CASE WHEN d.CnxEndDatetime < tc.EndTime THEN d.CnxEndDatetime ELSE tc.EndTime END AS EndDate,
tc.ConcurrencyLevel + 1 as ConcurrencyLevel
FROM dbo.MyTable d
INNER JOIN TransmissionConcurrency tc ON
((d.CnxStartDatetime between tc.StartTime and tc.EndTime)
or
(d.CnxEndDatetime between tc.StartTime and tc.EndTime)
or
(d.CnxStartDatetime <= tc.StartTime and d.CnxEndDatetime >= tc.EndTime))
)
SELECT *
FROM TransmissionConcurrency
ORDER BY ConcurrencyLevel, StartTime, EndTime
我想出了下图来尝试更好地解释我正在寻找的内容。
A [--------]
B [-----]
C [------]
D [---]
E [---]
F [-]
在本例中,cross join
方法会告诉我与 A
的最大并发数是 6(A
与 B、C、 D、E 和 F
)我正在寻找的最大并发数为 3(A
与 B,F
or A
与 C,E
)
I have a SQL Server table with two datetime fields (CnxStartdatetime
, CnxEnddatetime
). Each row represents a transmission of information. I am trying to find the maximum number of concurrent transmissions based on these two timestamps. I have a working query but it is both slow and extremely cumbersome. I know there must be a better way to go about this but can't come up with any.
For the current version, if I run it with 5 "levels" and get results I have to go back and add a ton of SQL to test if there are instances of 6 concurrent transmissions, etc. Once the query gets 7-8 "levels" deep it becomes very slow.
Snippet of current version:
select
t1.id, t2.id, t3.id, t4.id, t5.id, t6.id, t7.id, t8.id, t9.id, t10.id
FROM
dbo.MyTable t1, dbo.MyTable t2, dbo.MyTable t3, dbo.MyTable t4, dbo.MyTable t5,
dbo.MyTable t6, dbo.MyTable t7, dbo.MyTable t8, dbo.MyTable t9, dbo.MyTable t10
WHERE
(((t2.cnxstartdatetime >= t1.cnxstartdatetime) and (t2.cnxstartdatetime <= t1.cnxenddatetime))
or ((t2.cnxenddatetime >= t1.cnxstartdatetime) and (t2.cnxenddatetime <= t1.cnxenddatetime)))
AND
t2.id != t1.id
AND
(((t3.cnxstartdatetime >= t2.cnxstartdatetime) and (t3.cnxstartdatetime >= t1.cnxstartdatetime)and (t3.cnxstartdatetime <= t1.cnxenddatetime) and (t3.cnxstartdatetime <= t2.cnxenddatetime))
or ((t3.cnxenddatetime >= t2.cnxstartdatetime) and (t3.cnxenddatetime >= t1.cnxstartdatetime)and (t3.cnxenddatetime <= t1.cnxenddatetime) and (t3.cnxenddatetime <= t2.cnxenddatetime)))
AND
t3.id != t2.id AND t3.id != t1.id
AND
(((t4.cnxstartdatetime >= t3.cnxstartdatetime) and (t4.cnxstartdatetime >= t1.cnxstartdatetime)and (t4.cnxstartdatetime >= t2.cnxstartdatetime) and (t4.cnxstartdatetime <= t1.cnxenddatetime) and (t4.cnxstartdatetime <= t3.cnxenddatetime)and (t4.cnxstartdatetime <= t2.cnxenddatetime))
or ((t4.cnxenddatetime >= t3.cnxstartdatetime) and (t4.cnxenddatetime >= t1.cnxstartdatetime)and (t4.cnxenddatetime >= t2.cnxstartdatetime) and (t4.cnxenddatetime <= t1.cnxenddatetime)and (t4.cnxenddatetime <= t3.cnxenddatetime)and (t4.cnxenddatetime <= t2.cnxenddatetime)))
AND
t4.id != t3.id AND t4.id != t2.id AND t4.id != t1.id
... *snip*
Edit Many of the responses are suggesting I use a cross join
. This does not achieve the results I am looking for. Here's an example of the results of the cross join
for one record's "overlaps." This is the list that it gives me for ID 11787
As you can see, 11781
does not overlap 11774
This is simply a list of any record whose time span intersects 11787
11774 2011-04-29 01:02:56.780 2011-04-29 01:02:58.793
11777 2011-04-29 01:02:56.780 2011-04-29 01:02:58.843
11778 2011-04-29 01:02:56.780 2011-04-29 01:02:58.950
11775 2011-04-29 01:02:56.793 2011-04-29 01:02:58.843
11776 2011-04-29 01:02:56.793 2011-04-29 01:02:58.890
11780 2011-04-29 01:02:58.310 2011-04-29 01:03:02.687
11779 2011-04-29 01:02:58.327 2011-04-29 01:03:02.543
11787 2011-04-29 01:02:58.530 2011-04-29 01:03:08.827 **
11781 2011-04-29 01:02:59.030 2011-04-29 01:03:05.187
11782 2011-04-29 01:02:59.247 2011-04-29 01:03:05.467
11784 2011-04-29 01:02:59.293 2011-04-29 01:03:05.810
11791 2011-04-29 01:03:00.107 2011-04-29 01:03:13.623
11786 2011-04-29 01:03:00.843 2011-04-29 01:03:08.983
11783 2011-04-29 01:03:02.560 2011-04-29 01:03:05.793
11785 2011-04-29 01:03:02.717 2011-04-29 01:03:07.357
11790 2011-04-29 01:03:05.200 2011-04-29 01:03:14.153
11804 2011-04-29 01:03:05.687 2011-04-29 01:03:25.577
11811 2011-04-29 01:03:07.093 2011-04-29 01:03:35.153
11799 2011-04-29 01:03:07.123 2011-04-29 01:03:24.437
11789 2011-04-29 01:03:08.793 2011-04-29 01:03:13.577
I've also attempted writing a CTE with recursion but I can't figure out how to insure the current ID
doesn't match any previous ID
in the current stack of concurrency. The below just recurses upon itself until it hits the limit.
WITH TransmissionConcurrency (StartTime, EndTime, ConcurrencyLevel) AS
(
SELECT
CnxStartDatetime AS StartTime,
CnxEndDatetime AS EndTime,
1 AS ConcurrencyLevel
FROM dbo.MyTable
UNION ALL
SELECT
CASE WHEN d.CnxStartDatetime > tc.StartTime THEN d.CnxStartDatetime ELSE tc.StartTime END AS StartTime,
CASE WHEN d.CnxEndDatetime < tc.EndTime THEN d.CnxEndDatetime ELSE tc.EndTime END AS EndDate,
tc.ConcurrencyLevel + 1 as ConcurrencyLevel
FROM dbo.MyTable d
INNER JOIN TransmissionConcurrency tc ON
((d.CnxStartDatetime between tc.StartTime and tc.EndTime)
or
(d.CnxEndDatetime between tc.StartTime and tc.EndTime)
or
(d.CnxStartDatetime <= tc.StartTime and d.CnxEndDatetime >= tc.EndTime))
)
SELECT *
FROM TransmissionConcurrency
ORDER BY ConcurrencyLevel, StartTime, EndTime
I've come up with the below diagram to try to better explain what I'm looking for.
A [--------]
B [-----]
C [------]
D [---]
E [---]
F [-]
In this instance, the cross join
methods would tell me that the maximum concurrency with A
is 6 (A
with B, C, D, E and F
) What I'm looking for would be a max concurrency of 3 (A
with B,F
or A
with C,E
)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
杰夫.我曾经写过一个类似的查询 - 但在 Oracle 中 - 不确定这是否适用于 SQL-Server,但值得尝试:也许它会给你一些想法:
总体思路是我遍历 < 之间的所有请求code>:aStartWith 日期和
:aEndWith
日期为给定时间段内开始的每个请求分配 +1 权重部分,为结束于 的每个请求分配 -1同一时期。在这里,我假设请求不再是 10 分钟(
其中 trunc(req_recieved, 'mi') 位于 :aStartWith - Interval '10' 分钟和 :aEndWith + Interval '10' 分钟之间
);和 select ... from Dual 是边界条件。然后使用分析函数找到请求的结束时间(
lead(t.time) over (order by t.time, t.weight desc) as e
)并对当前请求的权重求和- 这将给出从时间b
开始到时间e
结束的多个请求 (sum(t.weight) over (order by t.time, t .weight desc) 为cnt
)。要找到最大请求数,您只需用所需的评估包装此查询即可。
您可以尝试一下这个场景是否适合您吗?希望如此:)
Jeff. I've written a similar query once - but in Oracle - not sure whether this will work in SQL-Server, but it's worth trying: maybe it'll give you some idea:
The general idea is that I go through all requests between
:aStartWith
date and:aEndWith
date assigning +1 weight portion to every request that starts in the given period and -1 to every request that end in the same period.Here I assume that requests are no longer that 10 minutes (
where trunc(req_recieved, 'mi') between :aStartWith - interval '10' minute and :aEndWith + interval '10' minute
); andselect ... from dual
are boundary conditions.Then with analytic functions I find the end time of the request (
lead(t.time) over (order by t.time, t.weight desc) as e
) and sum the weights for the current request - this will give a number of requests starting at timeb
and ending at timee
(sum(t.weight) over (order by t.time, t.weight desc) as cnt
).To find maximum number of requests you can just wrap this query with desired evaluations.
Could you please try if this scenario works for you? Hope it does :)
结果
如果您想要涉及的行,您可以使用以下内容:
结果:
Result
If you want the rows that is the involved you can use this:
Result:
它是报告解决方案而不是“标准”数据库查询。最好的选择是在每笔交易开始时在某处写入交易数量)。所有其他解决方案都会很慢。但如果你真的需要这个......
最简单的解决方案是将时间段分成小部分(例如天)并分析每个时间段的计数。这是一个示例:
您可以修改最后一条语句以获取所需的信息(最大加载期间的交易等)
It is rather reporting solution than "standard" database query. Best option for this is to write somewhere transactions quantity at start of every transaction). All other solutions will be slow. But if you really need this...
Simplest solution is to split time period for small parts (eg days) and analyse counts in every piece of period. Here is an example:
You can modify last statement to get needed information (transactions in maximum loading period etc)
我知道游标不受欢迎,但交叉连接也是如此。对于所提供的示例数据,这将返回
8
。I know cursors are frowned upon, but so are cross joins. This returns
8
for the sample data provided.