How can I check for average concurrent events in a SQL table based on the date, time and duration of the events?

Posted on 2024-07-17 05:01:49

I have a set of call detail records, and from those records, I'm supposed to determine the average concurrent active calls per system, per hour (at a precision of one minute). If I query 7pm to 8pm, I should see the average concurrent calls for the hour (averaging the concurrent calls for each minute) within that hour (for each system).

So, I need a way to check for a count of active calls for 7:00-7:01, 7:01-7:02, etc., then average those numbers. A call is considered active if the call's time and duration fall within the current minute being checked.

What makes this even more difficult is that it needs to span SQL 7.0 and SQL 2000 (some functions in 2000, such as GETUTCDATE(), aren't available in 7.0). If I can just get 2000 working, I'll be happy.

What approaches to this problem can I take?

I thought about looping through the minutes (60) in the hour being checked and adding up the count of calls that fall within each minute, and then somehow cross-referencing the duration to make sure that a call that starts at 7:00 pm and has a duration of 300 seconds shows as active at 7:04, but I can't imagine how to approach the problem. I tried to figure out a way to weight each call against a particular minute that would tell me whether the call was active during that minute or not, but couldn't come up with an effective solution.

The data types here are the same as I have to query against. I don't have any control over the schema (other than possibly converting the data and inserting into another table with more appropriate data types). I've provided some example data that I know has concurrent active calls.

CREATE TABLE Records(
  seconds char(10),
  time char(4),
  date char(8),
  dur int,
  system int,
  port int
)

--seconds is an stime value. It's the difference of seconds from UTC 1/1/1970 00:00:00 to the current UTC time, we use it as an identifier (like epoch).
--time is the time the call was made.
--date is the day the call was made.
--dur is the duration of the call in seconds.
--system is the system number.
--port is the port on the system (not particularly relevant for this question).

INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924228','1923','20090416',105,2,2)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239923455','1910','20090416',884,1,97)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924221','1923','20090416',116,2,15)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924259','1924','20090416',90,1,102)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239923458','1910','20090416',891,2,1)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924255','1924','20090416',99,2,42)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924336','1925','20090416',20,2,58)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924293','1924','20090416',64,2,41)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239923472','1911','20090416',888,2,27)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924347','1925','20090416',25,1,100)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924301','1925','20090416',77,2,55)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924332','1925','20090416',52,2,43)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924240','1924','20090416',151,1,17)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924313','1925','20090416',96,2,62)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924094','1921','20090416',315,2,16)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239923643','1914','20090416',788,2,34)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924447','1927','20090416',6,2,27)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924342','1925','20090416',119,2,15)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924397','1926','20090416',76,2,41)
INSERT INTO Records(seconds, time, date, dur, system, port) VALUES('1239924457','1927','20090416',23,2,27)

Comments (6)

话少情深 2024-07-24 05:01:49

I think MarkusQ has the answer, but let me develop an alternative that you may find easier to use. I'll use my customary method of developing this as a series of simple transformations in views, an analogue of functional decomposition in a procedural language.

First, let's put everything in common units. Recall that record's column s is seconds since the epoch, midnight 1 January 1970 (here s stands for the question's seconds column, which would need a cast from char(10) to int). We can find the number of seconds since midnight of the day on which the call occurred by taking s modulo the number of seconds in a day: s % (60 * 60 * 24).

select *, 
s % (60 * 60 * 24) as start_secs_from_midnight,
s % (60 * 60 * 24) + dur - 1 as end_secs_from_midnight
from record
;

We subtract one from s + dur because a one second call that starts at 12:00:00 also ends on 12:00:00.

We can find minutes since midnight by dividing those results by 60, or just by floor( s / 60 ) % (60 * 24) :

create view record_mins_from_midnight as
select *, 
floor( s / 60 ) % (60 * 24) as start_mins_fm,
floor( ( s + dur - 1) / 60 ) % (60 * 24) as end_mins_fm 
from record
;
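As a quick sanity check of the arithmetic above, here is a minimal Python sketch (not part of the original answer) that applies the same modulus and floor steps to one row of the question's sample data. The function name `minutes_from_midnight` is made up for illustration, and the result is in UTC because s is an epoch value, so it will not match the local `time` column.

```python
MINUTES_PER_DAY = 60 * 24

def minutes_from_midnight(s, dur):
    """Return (start_minute, end_minute) since midnight UTC for one call.

    s is the epoch second at which the call starts, dur its length in seconds.
    """
    start = (s // 60) % MINUTES_PER_DAY
    # The last active second is s + dur - 1, so a one-second call
    # starts and ends in the same minute.
    end = ((s + dur - 1) // 60) % MINUTES_PER_DAY
    return start, end

# First sample row from the question: seconds = 1239924228, dur = 105.
start, end = minutes_from_midnight(1239924228, 105)
print(start, end)
```

Like the view, this ignores calls that run past midnight; those would need the date carried along as well.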

Now we create a table of minutes. We need 1440 of them, numbered from 0 to 1439. In databases that don't support arbitrary sequences, I create an artificial range or sequence like this (MySQL syntax; in SQL Server use IDENTITY(1,1) instead of auto_increment):

  create table artificial_range ( 
   id int not null primary key auto_increment, idz int) ;
  insert into artificial_range(idz) values (0);
  -- repeat next line 11 times to double the rows up to 2048 (>= 1440)
  insert into artificial_range(idz) select idz from artificial_range;

So to create a minutes view:

  create view minutes as 
   select id - 1 as active_minute 
   from artificial_range 
   where id <= 1440
   ;

Now we just join minutes to our record view:

create view record_active_minutes as
select * from minutes a 
join record_mins_from_midnight b
on (a.active_minute >= b.start_mins_fm 
and a.active_minute <= b.end_mins_fm)
 ;

This just cross products/multiplies record rows, so we have one record row for each whole minute over which the call was active.

Note that I'm doing this by defining active as "(part of) the call occurred during a minute". That is, a two second call that starts at 12:00:59 and ends at 12:01:01 by this definition occurs during two different minutes, but a two second call that starts at 12:00:58 and ends at 12:00:59 occurs during one minute.

I did that because you specified "So, I need a way to check for a count of active calls for 7:00-7:01, 7:01-7:02". If you prefer to consider only calls lasting more than sixty seconds to occur in more than one minute, you'll need to adjust the join.

Now if we want to find the number of active records for any granularity equal to or larger than minute granularity, we just group on that last view. To find the average concurrent calls per minute for each hour, we count the (call, minute) rows in the hour and divide by its 60 minutes:

 select floor( active_minute / 60 ) as hour, 
 count(*) / 60 as avg_concurrent_calls_per_minute_for_hour
 from record_active_minutes
 group by floor( active_minute / 60 ) ;

Note that this is the average per hour for all calls, over all days; if we want to limit it to a particular day or range of days, we'd add a where clause.
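To make the join-then-group idea concrete, here is a small Python sketch of the same logic. It is only an illustration under my own assumptions: the function name `avg_concurrent_per_hour` and the two sample calls are hypothetical, and calls are given as inclusive (start_minute, end_minute) pairs within a single day.

```python
from collections import Counter

def avg_concurrent_per_hour(calls):
    """calls: iterable of (start_minute, end_minute), inclusive, minutes since midnight.

    Returns {hour: average number of active calls per minute over that hour}.
    """
    active = Counter()
    for start, end in calls:
        # One entry per active minute, like one joined row per (call, minute).
        for minute in range(start, end + 1):
            active[minute] += 1

    per_hour = Counter()
    for minute, count in active.items():
        per_hour[minute // 60] += count

    # Divide by the 60 minutes of the hour, not by the number of busy minutes.
    return {hour: total / 60 for hour, total in per_hour.items()}

# Two hypothetical calls: one active 19:10-19:24, one active 19:23-19:24.
print(avg_concurrent_per_hour([(1150, 1164), (1163, 1164)]))
```

Hours with no calls simply do not appear in the result, which is the same behaviour as the inner join above.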


But wait, there's more!

If we create a version of record_active_minutes that does a left outer join, we can get a report that shows the average over all hours in the day:

 create view record_active_minutes_all as
 select * 
 from 
 minutes a 
 left outer join record_mins_from_midnight b
   on (a.active_minute >= b.start_mins_fm 
       and a.active_minute <= b.end_mins_fm) 
 ;

Then we again do our select, but against the new view:

 select floor( active_minute / 60 ) as hour, 
 -- count a column from the record side so minutes with no calls count as 0
 count(start_mins_fm) / 60 as avg_concurrent_calls_per_min
 from record_active_minutes_all
 group by floor( active_minute / 60 ) ;


+------+------------------------------+
| hour | avg_concurrent_calls_per_min |
+------+------------------------------+
|    0 |                       0.0000 |
|    1 |                       0.0000 |
|    2 |                       0.0000 |
|    3 |                       0.0000 |
   etc....

We can also filter this with a where clause. Unfortunately, the left join means the columns from the underlying record table are null for minutes in which no calls exist, so, e.g.,

 select floor( active_minute / 60 ) as hour, 
 count(start_mins_fm) / 60 as avg_concurrent_calls_per_min
 from record_active_minutes_all
 where month(date) = 1 and year(date) = 2008 
 group by floor( active_minute / 60 ) ;

will bring back no rows for hours in which no calls occurred. If we still want our "report-like" view that shows all hours, we make sure we also include those minutes with no records:

 select floor( active_minute / 60 ) as hour, 
 count(start_mins_fm) / 60 as avg_concurrent_calls_per_minute_for_hour
 from record_active_minutes_all
 where (month(date) = 1 and year(date) = 2008) 
 or date is null 
 group by floor( active_minute / 60 ) ;

Note that in the last two examples, I'm using a SQL date (to which the functions month and year can be applied), not the char(8) date in your record table.

Which brings up another point: both the date and time in your record table are superfluous and denormalized, as each can be derived from your column s. Leaving them in the table allows the possibility of inconsistent rows, in which date(s) <> date or time(s) <> time. I'd prefer to do it like this:

   create table record ( id int not null primary key, s int not null, duration int not null) ; 

   create view record_date as 
   select *, dateadd( ss, s, '1970-01-01') as call_date
   from record
  ;

In the dateadd function, ss is the datepart argument that tells the function to add seconds; s is the column in record.
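For comparison, the same epoch-to-date conversion can be sketched in Python. This is just an illustration; the helper name `call_date` is mine, and the result is a UTC value, exactly as with dateadd on an epoch column.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def call_date(s):
    """Convert epoch seconds to a (naive, UTC) datetime, like dateadd(ss, s, '1970-01-01')."""
    return EPOCH + timedelta(seconds=s)

# First sample row from the question.
print(call_date(1239924228))
```

For that row the derived UTC time differs from the stored `time` value of 1923, which suggests the stored date and time are local; that is worth keeping in mind before treating them as interchangeable.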

在风中等你 2024-07-24 05:01:49

If I understand you correctly, you want to get a count of all records for which the start time is less than t+60 seconds and the start time plus the duration is greater than or equal to t, for each t in the interval of interest (e.g., t=7:00, 7:01, 7:02...etc.).

Then it's just a matter of averaging these counts.

But what is an average? It's just the sum divided by the number of items, right? In this case, the number of items will always be equal to the time range in minutes, and the sum will be equal to the sum of the call-minutes that fall within the interval, which you can compute in one pass over the data given.

Sound less impossible now? In pseudo SQL:

select sum( 
     ((time+duration rounded up to next minute, capped at end of period)
    - (time rounded down, bottom-capped at start of period) - 1)
     /(1 minute) )
  from Records
  where date is right

Then just divide that by the number of minutes in the period of interest.
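Here is one way the pseudo SQL above could be spelled out, as a Python sketch rather than the answerer's actual query. The function name `avg_concurrent` and its parameters are hypothetical; times are plain seconds, the period is the half-open range [period_start, period_end) and is assumed to be minute-aligned, and a call is taken to be active from its start second through start + dur - 1.

```python
def avg_concurrent(calls, period_start, period_end):
    """calls: iterable of (start_sec, dur_sec). Returns average active calls per minute."""
    total_minutes = 0
    for start, dur in calls:
        end = start + dur - 1                     # last second the call is active
        first = max(start, period_start) // 60    # first covered minute, capped at period start
        last = min(end, period_end - 1) // 60     # last covered minute, capped at period end
        if first <= last:                         # ignore calls entirely outside the period
            total_minutes += last - first + 1
    return total_minutes / ((period_end - period_start) // 60)

# Two 2-second calls that each straddle a minute boundary, over a one-hour period.
print(avg_concurrent([(59, 2), (119, 2)], 0, 3600))
```

Each call contributes the number of minute slots it touches, so no per-minute loop or minutes table is needed.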

肩上的翅膀 2024-07-24 05:01:49

My first bit of advice would be, if you ever find yourself saying (when using SQL) "I can create a loop..." then you should immediately start looking for a set-based approach. Get out of the procedural mindset when using SQL.

There are still a few fuzzy parts to your logic. Does a call count as being during a minute period if it simply has any portion of the call during that minute? For example, if a call starts at 1923 and lasts 62 seconds is it considered to overlap with all calls starting at 1924? I'm going to assume yes on that one, but you can adjust the code below if that's not true. It should be a minor tweak.

For the breakdown to minutes I would usually use a table with time spans - a start time and end time for each of the slices in which I'm interested. In your case since you are dealing with exact minutes and your start times are in minutes (even if the data type is screwed up) I'm just going to use a single column with the exact minute in question.

To set up that table:

CREATE TABLE dbo.Minutes (
    start_time  INT NOT NULL,
    CONSTRAINT PK_Minutes PRIMARY KEY CLUSTERED (start_time)
)

DECLARE
    @hour   TINYINT,
    @minute TINYINT

SET @hour = 19
SET @minute = 0

WHILE (@hour <= 20)
BEGIN
    INSERT INTO dbo.Minutes (start_time) VALUES (@hour * 100 + @minute)

    SET @minute = @minute + 1
    IF @minute = 60
    BEGIN
        SET @minute = 0
        SET @hour = @hour + 1
    END
END

Now we can select for the averages, etc.

SELECT
    M.start_time,
    COUNT(R.seconds)
FROM
    dbo.Minutes M
LEFT OUTER JOIN dbo.Records R ON
    M.start_time BETWEEN CAST(R.time AS INT) AND
        (CAST(SUBSTRING(time, 1, 2) AS INT) * 100) +    -- hours
        (FLOOR((CAST(SUBSTRING(time, 3, 2) AS INT) + FLOOR(dur/60))/60)) * 100 +  -- carryover to hours
        (CAST(SUBSTRING(time, 3, 2) AS INT) + dur/60) % 60  -- minutes
GROUP BY
    M.start_time

You'll need to use that as a subquery to get the averages over a given time. Since it's late on Friday I'll leave that step up to you ;)

EDIT: One last caveat: I didn't account for time spans that cross day boundaries (i.e., go past midnight). Hopefully the code points you in the right direction for that. A better approach might be to create a view that turns all of those nasty strings into actual DATETIME values, then this becomes really trivial with the Minutes table.
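To illustrate the suggestion of turning the strings into real date/time values, here is a minimal Python sketch of the conversion. It is not the view itself, only the idea; the function name `call_span` is made up, and it assumes the date is always YYYYMMDD and the time always HHMM, as in the sample data.

```python
from datetime import datetime, timedelta

def call_span(date_str, time_str, dur):
    """Return (start, end) datetimes for a call from char(8) date, char(4) time and seconds."""
    start = datetime.strptime(date_str + time_str, "%Y%m%d%H%M")
    return start, start + timedelta(seconds=dur)

start, end = call_span("20090416", "1923", 105)
print(start, end)
```

Once the values are real datetimes, a call that runs past midnight simply ends on the next date, so the day-boundary caveat above goes away.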

鸠魁 2024-07-24 05:01:49

As MarkusQ stated, your definition of "concurrent" allows you to short cut the maths.

  • Call (A) starts at "12:00:59" and ends at "12:01:01"
  • Call (B) starts at "12:01:59" and ends at "12:02:01"
    => 1 call in the "12:00" interval
    => 2 calls in the "12:01" interval
    => 1 call in the "12:02" interval

The average concurrent calls is then (1+2+1)/intervalCount

The (1+2+1) can be calculated differently, and more quickly/easily:

  • Call (A) covers 2 different minute intervals (12:00 and 12:01)
  • Call (B) covers 2 different minute intervals (12:01 and 12:02)
    => total covered minutes = 4

The important fact here (and why I bothered to reply after MarkusQ posted) is that the duration of a call itself isn't enough to calculate how many minute intervals are covered. In my example, both calls only last 2 seconds...

You need the following info:
- the "start time", rounded down to the minute
- the "end time", rounded down to the minute
=> covered intervals = number of minutes difference + 1

To round the "time" field down to the minute I'd use this...

DATEADD(minute, DATEDIFF(minute, 0, time), 0)

So the number of covered minutes by a single call would be...

DATEDIFF(
   minute,
   DATEADD(minute, DATEDIFF(minute, 0, time), 0),
   DATEADD(second, dur, time)
) + 1

No need to round the "end time" down.
Using DATEDIFF(minute) gives rounding down anyway.

SUM that value for the range you're looking at, then divide by the number of minutes in that range, and you have your answer.
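The two example calls can be checked with a short Python sketch that mirrors the DATEDIFF(minute, ...) + 1 expression. The helper name `covered_minutes` is hypothetical; it follows this answer's convention that the end time is start + dur, so a call ending exactly on a minute boundary also counts that minute.

```python
from datetime import datetime, timedelta

def covered_minutes(start, dur_seconds):
    """Number of distinct minute intervals touched by a call starting at datetime `start`."""
    end = start + timedelta(seconds=dur_seconds)
    # Truncate both ends to the minute, as DATEDIFF(minute, ...) effectively does.
    floor_start = start.replace(second=0, microsecond=0)
    floor_end = end.replace(second=0, microsecond=0)
    return int((floor_end - floor_start).total_seconds() // 60) + 1

a = covered_minutes(datetime(2009, 1, 1, 12, 0, 59), 2)  # call (A)
b = covered_minutes(datetime(2009, 1, 1, 12, 1, 59), 2)  # call (B)
print(a, b, a + b)
```

Both 2-second calls cover two intervals each, which gives the same total of 4 as adding up the per-interval counts 1 + 2 + 1.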

If you're only looking for calls that are truly concurrent you can't use such tricks, but it's still possible (I've had to do something similar). But for your definition of concurrent, this should do it...

DECLARE
   @date DATETIME, @start DATETIME, @end DATETIME
SELECT
   @date = '2009 Jan 01', @start = '12:00', @end = '13:00'

SELECT
   system,
   SUM(
       DATEDIFF(
          minute,
          CASE WHEN
             CAST(LEFT(time,2) + ':' + RIGHT(time,2) AS DATETIME) < @start
          THEN
             @start
          ELSE
             CAST(LEFT(time,2) + ':' + RIGHT(time,2) AS DATETIME)
          END,
          CASE WHEN
             DATEADD(second, dur, CAST(LEFT(time,2) + ':' + RIGHT(time,2) AS DATETIME)) >= @end
          THEN
             DATEADD(minute, -1, @end)   -- last interval inside the range, e.g. 12:59
          ELSE
             DATEADD(second, dur, CAST(LEFT(time,2) + ':' + RIGHT(time,2) AS DATETIME))
          END
       ) + 1
   )
   /
   CAST(DATEDIFF(minute, @start, @end) AS FLOAT)
FROM
   records
WHERE
   date = @date
   -- keep every call that overlaps the range, not only calls fully inside it
   AND CAST(LEFT(time,2) + ':' + RIGHT(time,2) AS DATETIME) < @end
   AND DATEADD(second, dur, CAST(LEFT(time,2) + ':' + RIGHT(time,2) AS DATETIME)) >= @start
GROUP BY
   system

This will deliberately not include the interval 13:00->13:01
Only the 60 "1 minute long intervals" 12:00->12:01 through to 12:59->13:00

EDIT:

I just noticed that your times and dates are stored as strings; you'd need to convert those to DATETIMEs for my code to work.

EDIT2:

Bug corrected. If a call started at "11:59:01" and ended at "12:00:01", the "11:59" interval should not be counted. CASE statements added to compensate.

Various Layout Edits

另类 2024-07-24 05:01:49

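I approached the problem by converting the data into an easier format. I created a table where each row represents one minute of a call. Once you have that, the average per minute by hour is simple. There are multiple selects in there to show the intermediate results. As long as the time range being queried and the durations aren't extremely large, it should be ok...?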

CREATE TABLE #Records(
  seconds char(10),
  [time] char(4),
  date char(8),
  dur int,
  system int,
  port int
)

/*
seconds is an s[time] value. It's the difference of seconds from UTC 1/1/1970 00:00:00 to the current UTC [time], we use it as an identifier (like epoch).
[time] is the [time] the call was made.
date is the day the call was made.
dur is the duration of the call in seconds.
system is the system number.
port is the port on the system (not particularly relevant for this question).
*/

INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924228','1923','20090416',105,2,2)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239923455','1910','20090416',884,1,97)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924221','1923','20090416',116,2,15)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924259','1924','20090416',90,1,102)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239923458','1910','20090416',891,2,1)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924255','1924','20090416',99,2,42)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924336','1925','20090416',20,2,58)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924293','1924','20090416',64,2,41)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239923472','1911','20090416',888,2,27)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924347','1925','20090416',25,1,100)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924301','1925','20090416',77,2,55)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924332','1925','20090416',52,2,43)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924240','1924','20090416',151,1,17)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924313','1925','20090416',96,2,62)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924094','1921','20090416',315,2,16)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239923643','1914','20090416',788,2,34)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924447','1927','20090416',6,2,27)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924342','1925','20090416',119,2,15)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924397','1926','20090416',76,2,41)
INSERT INTO #Records(seconds, [time], date, dur, system, port) VALUES('1239924457','1927','20090416',23,2,27)

/* convert date + [time] into datetimes */
select 
    seconds,
    system,
    cast(date + ' ' + left([time], 2) + ':' + right([time], 2) as datetime) as start_date,
    /* end date to the minute */
    dateadd(mi, datediff(mi, 0, dateadd(s, dur, cast(date + ' ' + left([time], 2) + ':' + right([time], 2) as datetime))), 0) as end_date
into 
    #r
from
    #Records

select * from #r order by system, seconds, start_date, end_date;

/* create a row for each minute of each call */
create table #r_min(rnd int, seconds char(10), system int, minute datetime)

declare @maxrnd int;
select @maxrnd = max(datediff(mi, r.start_date, r.end_date)) from #r r
declare @i int;
set @i = 0;

while @i <= @maxrnd begin  /* <= so the final minute of the longest call is included */

    insert into #r_min
    select @i, r.seconds, r.system, dateadd(mi, @i, r.start_date)
    from #r r
    where dateadd(mi, @i, r.start_date) <= r.end_date

set @i = @i + 1
end

select * from #r_min order by system, seconds, minute

/* concurrent per minute */
select  
    system, minute, count(*) as cnt
from 
    #r_min 
group by
    system, minute
order by 
    system, minute

/* avg concurrent per minute by hour */
select
    m.system,
    dateadd(hh, datediff(hh, 0, m.minute), 0) as hour,
    avg(m.cnt) as average_concurrent_per_minute
from
    (select  
        system, minute, count(*) as cnt
    from 
        #r_min 
    group by
        system, minute
    ) m
group by
    m.system,
    dateadd(hh, datediff(hh, 0, m.minute), 0)


drop table #Records
drop table #r
drop table #r_min

The final select yields...

system  hour                     average_concurrent_per_minute
1       2009-04-16 19:00:00.000  1
2       2009-04-16 19:00:00.000  3

I approached the problem by converting the data into an easier format. I created a table where each row represents one minute of a call. Once you have that the average per minute by hour is simple. There are multiple selects in there to show the intermediate results. As long as the time range being queried and the durations aren't extremely large, it should be ok...?

べ映画 2024-07-24 05:01:49

I can see only one approach that extracts the data as specified from the call records:

Create a list of events, where an event is defined as the beginning of a call or the end of a call. (Thus each call record will generate two events.) Each event item should contain: system, datetime, and a boolean begin/end flag. The datetime should be rounded down to the minute.

Sort this list by (system, datetime) and scan it. For each call begin, increment CURCNT by one. For each call end, decrement CURCNT by one.

If the datetime value is different from the previous record, add CURCNT to HOURSUM. If the datetime value indicates the start of a new hour, divide HOURSUM by 60, write a new result record (system, date, hour, average), and reset HOURSUM to zero.

It should be obvious when to initialize CURCNT and HOURSUM, what to do when the system value is different from the previous record, etc.

-Al.
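
The event scan described above could be sketched in T-SQL roughly as follows. This is one possible reading of the idea, not the answerer's own code: the #calls table, its columns and its two sample rows are assumptions made for illustration, and the end event is placed one minute after the last active minute so that the last minute is still counted. It uses only a cursor and temp tables, so it should not need anything newer than SQL Server 2000.

```sql
/* Hypothetical input: one row per call, first and last active minute
   already truncated to the minute. */
create table #calls (system int, first_min datetime, last_min datetime)
insert into #calls values (1, '20090416 19:58', '20090416 20:01')
insert into #calls values (1, '20090416 19:59', '20090416 19:59')

/* Two events per call: +1 at the first active minute,
   -1 at the minute after the last active one. */
create table #events (system int, minute datetime, delta int)
insert into #events select system, first_min, 1 from #calls
insert into #events select system, dateadd(mi, 1, last_min), -1 from #calls

/* Sum of the per-minute active-call counts for each (system, hour). */
create table #hoursum (system int, hour datetime, total int)

declare @sys int, @min datetime, @delta int
declare @prevsys int, @prevmin datetime, @cur int, @hr datetime
set @cur = 0

declare ev cursor local fast_forward for
    select system, minute, sum(delta)
    from #events
    group by system, minute
    order by system, minute

open ev
fetch next from ev into @sys, @min, @delta
while @@fetch_status = 0
begin
    /* a new system starts with no open calls */
    if @prevsys is null or @prevsys <> @sys
        set @cur = 0

    /* credit @cur to every minute from the previous event up to,
       but not including, the current event */
    while @cur > 0 and @prevmin < @min
    begin
        set @hr = dateadd(hh, datediff(hh, 0, @prevmin), 0)
        update #hoursum set total = total + @cur
        where system = @sys and hour = @hr
        if @@rowcount = 0
            insert into #hoursum values (@sys, @hr, @cur)
        set @prevmin = dateadd(mi, 1, @prevmin)
    end

    set @cur = @cur + @delta
    set @prevsys = @sys
    set @prevmin = @min
    fetch next from ev into @sys, @min, @delta
end
close ev
deallocate ev

/* dividing by 60 treats idle minutes as zero concurrent calls */
select system, hour, total / 60.0 as average_concurrent_per_minute
from #hoursum
order by system, hour

drop table #calls
drop table #events
drop table #hoursum
```

The inner loop walks minute by minute so that a span crossing an hour boundary is credited to the correct hour; for long calls, a set-based split at the hour boundaries would likely be faster, at the cost of more code.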
