SQL query for the cumulative frequency of a list of datetimes

Posted on 2024-07-06 19:02:56

I have a list of times in a database column (representing visits to a website).

I need to group them in intervals and then get a 'cumulative frequency' table of those dates.

For instance I might have:

9:01
9:04
9:11
9:13
9:22
9:24
9:28

and I want to convert that into:

9:05 - 2
9:15 - 4
9:25 - 6
9:30 - 7

How can I do that? Can I even achieve this easily in SQL? I can quite easily do it in C#.

夏花。依旧 2024-07-13 19:02:56
create table accu_times (time_val datetime not null, constraint pk_accu_times primary key (time_val));
go

insert into accu_times values ('9:01');
insert into accu_times values ('9:05');
insert into accu_times values ('9:11');
insert into accu_times values ('9:13');
insert into accu_times values ('9:22');
insert into accu_times values ('9:24');
insert into accu_times values ('9:28'); 
go

select rounded_time,
    (
    select count(*)
    from accu_times as at2
    where at2.time_val <= rt.rounded_time
    ) as accu_count
from (
select distinct
  dateadd(minute, round((datepart(minute, at.time_val) + 2)*2, -1)/2,
    dateadd(hour, datepart(hour, at.time_val), 0)
  ) as rounded_time
from accu_times as at
) as rt
order by rt.rounded_time
go

drop table accu_times

Results in:

rounded_time            accu_count
----------------------- -----------
1900-01-01 09:05:00.000 2
1900-01-01 09:15:00.000 4
1900-01-01 09:25:00.000 6
1900-01-01 09:30:00.000 7
ぃ双果 2024-07-13 19:02:56

I should point out that based on the stated "intent" of the problem, to do analysis on visitor traffic - I wrote this statement to summarize the counts in uniform groups.

To do otherwise (as in the "example" groups) would be comparing the counts during a 5 minute interval to counts in a 10 minute interval - which doesn't make sense.

You have to grok the "intent" of the user requirement, not the literal "reading" of it. :-)

    create table #myDates
       (
       myDate       datetime
       );
    go

    insert into #myDates values ('10/02/2008 09:01:23');
    insert into #myDates values ('10/02/2008 09:03:23');
    insert into #myDates values ('10/02/2008 09:05:23');
    insert into #myDates values ('10/02/2008 09:07:23');
    insert into #myDates values ('10/02/2008 09:11:23');
    insert into #myDates values ('10/02/2008 09:14:23');
    insert into #myDates values ('10/02/2008 09:19:23');
    insert into #myDates values ('10/02/2008 09:21:23');
    insert into #myDates values ('10/02/2008 09:21:23');
    insert into #myDates values ('10/02/2008 09:21:23');
    insert into #myDates values ('10/02/2008 09:21:23');
    insert into #myDates values ('10/02/2008 09:21:23');
    insert into #myDates values ('10/02/2008 09:26:23');
    insert into #myDates values ('10/02/2008 09:27:23');
    insert into #myDates values ('10/02/2008 09:29:23');
    go

    declare @interval int;
    set @interval = 10;

    select
       convert(varchar(5), dateadd(minute,@interval - datepart(minute, myDate) % @interval, myDate), 108) timeGroup,
       count(*)
    from
       #myDates
    group by
       convert(varchar(5), dateadd(minute,@interval - datepart(minute, myDate) % @interval, myDate), 108)

returns:

timeGroup             
--------- ----------- 
09:10     4           
09:20     3           
09:30     8           
余厌 2024-07-13 19:02:56

Ooh, all of that stuff is way too complicated.

Normalise to seconds, divide by your bucket interval, truncate and remultiply:

select sec_to_time(floor(time_to_sec(d)/300)*300) as i, count(*)
from d
group by i

Using Ron Savage's data, I get the following (these particular counts correspond to round() rather than floor()):

+----------+----------+
| i        | count(*) |
+----------+----------+
| 09:00:00 |        1 |
| 09:05:00 |        3 |
| 09:10:00 |        1 |
| 09:15:00 |        1 |
| 09:20:00 |        6 |
| 09:25:00 |        2 |
| 09:30:00 |        1 |
+----------+----------+

You may wish to use ceil() or round() instead of floor().

Update: for a table created with

create table d (
    d datetime
);
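
The normalise, divide, truncate and remultiply step is plain arithmetic, so it can be checked outside the database. Below is a minimal Python sketch, assuming a 300-second bucket; `bucket_start`, `to_seconds`, `to_hms`, `histogram` and the `mode` parameter are hypothetical names made up for illustration, and the timestamps are the sample rows from the earlier answer.

```python
import math
from collections import Counter

def bucket_start(seconds, width=300, mode="floor"):
    """Snap a seconds-since-midnight value to a bucket boundary."""
    ratio = seconds / width
    if mode == "floor":
        n = math.floor(ratio)
    elif mode == "ceil":
        n = math.ceil(ratio)
    elif mode == "round":
        # Round half up; Python's built-in round() rounds ties to even.
        n = math.floor(ratio + 0.5)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return n * width

def to_seconds(hms):
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def to_hms(seconds):
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"

# Sample rows from the earlier answer (time of day only).
visits = (
    ["09:01:23", "09:03:23", "09:05:23", "09:07:23", "09:11:23",
     "09:14:23", "09:19:23"]
    + ["09:21:23"] * 5
    + ["09:26:23", "09:27:23", "09:29:23"]
)

def histogram(mode):
    """Count visits per 5-minute bucket for the given snapping mode."""
    counts = Counter(to_hms(bucket_start(to_seconds(v), 300, mode)) for v in visits)
    return dict(sorted(counts.items()))

floor_counts = histogram("floor")
round_counts = histogram("round")
print(floor_counts)
print(round_counts)
```

With these rows, `mode="round"` reproduces the counts in the table above, while `mode="floor"` gives 2, 2, 2, 1, 5, 3 for the buckets 09:00 through 09:25.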
尸血腥色 2024-07-13 19:02:56

Create a table periods describing the periods you wish to divide the day up into.

SELECT periods.name, count(times.time)
  FROM periods, times
 WHERE periods.start <= times.time
   AND                  times.time < periods.end
 GROUP BY periods.name
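
One way to populate such a table is to generate half-open periods of a fixed width. Here is a rough sketch using Python's built-in `sqlite3` module. The helper `make_periods`, the column names `period_start` and `period_end` (picked to stay clear of reserved words), the 10-minute width and the use of the question's sample times are all assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

def make_periods(first, last, minutes):
    """Yield (name, start, end) rows covering [first, last) in fixed steps."""
    step = timedelta(minutes=minutes)
    start = first
    while start < last:
        end = start + step
        yield (f"{start:%H:%M}-{end:%H:%M}", f"{start:%H:%M}", f"{end:%H:%M}")
        start = end

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE periods (name TEXT, period_start TEXT, period_end TEXT)")
conn.execute("CREATE TABLE times (time TEXT)")
conn.executemany(
    "INSERT INTO periods VALUES (?, ?, ?)",
    make_periods(datetime(2008, 10, 2, 9, 0), datetime(2008, 10, 2, 9, 30), 10),
)
# Zero-padded HH:MM strings compare correctly as plain text.
conn.executemany(
    "INSERT INTO times VALUES (?)",
    [(t,) for t in ["09:01", "09:04", "09:11", "09:13", "09:22", "09:24", "09:28"]],
)

# LEFT JOIN keeps periods that contain no visits (they count as 0).
period_counts = conn.execute(
    """
    SELECT periods.name, COUNT(times.time)
      FROM periods
      LEFT JOIN times
        ON periods.period_start <= times.time
       AND times.time < periods.period_end
     GROUP BY periods.name
     ORDER BY periods.period_start
    """
).fetchall()
print(period_counts)
```

These are per-period counts, not cumulative ones; dropping the lower bound from the join condition would turn them into running totals.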
谁许谁一生繁华 2024-07-13 19:02:56

Create a table containing the intervals at which you want totals, then join the two tables together.

Such as:

time_entry.time_entry
-----------------------
2008-10-02 09:01:00.000
2008-10-02 09:04:00.000
2008-10-02 09:11:00.000
2008-10-02 09:13:00.000
2008-10-02 09:22:00.000
2008-10-02 09:24:00.000
2008-10-02 09:28:00.000

time_interval.time_end
-----------------------
2008-10-02 09:05:00.000
2008-10-02 09:15:00.000
2008-10-02 09:25:00.000
2008-10-02 09:30:00.000

SELECT 
    ti.time_end, 
    COUNT(*) AS 'interval_total' 
FROM time_interval ti
INNER JOIN time_entry te
    ON te.time_entry < ti.time_end
GROUP BY ti.time_end;


time_end                interval_total
----------------------- -------------
2008-10-02 09:05:00.000 2
2008-10-02 09:15:00.000 4
2008-10-02 09:25:00.000 6
2008-10-02 09:30:00.000 7

If you want totals within a range instead of cumulative totals, add a time_start column to the time_interval table and change the query to:

SELECT 
    ti.time_end, 
    COUNT(*) AS 'interval_total' 
FROM time_interval ti
INNER JOIN time_entry te
    ON te.time_entry >= ti.time_start
            AND te.time_entry < ti.time_end
GROUP BY ti.time_end;
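
The cumulative join can be tried end to end with Python's built-in `sqlite3` module. This is only a sketch under a few assumptions: timestamps are stored as ISO-8601 text (which sorts correctly as strings), and an extra 09:00 interval, not part of the sample data above, is added to show what happens to an interval with no entries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE time_entry (time_entry TEXT NOT NULL);
    CREATE TABLE time_interval (time_end TEXT NOT NULL);

    INSERT INTO time_entry VALUES
        ('2008-10-02 09:01:00'), ('2008-10-02 09:04:00'),
        ('2008-10-02 09:11:00'), ('2008-10-02 09:13:00'),
        ('2008-10-02 09:22:00'), ('2008-10-02 09:24:00'),
        ('2008-10-02 09:28:00');

    -- The 09:00 row is an added assumption: an interval with no entries.
    INSERT INTO time_interval VALUES
        ('2008-10-02 09:00:00'), ('2008-10-02 09:05:00'),
        ('2008-10-02 09:15:00'), ('2008-10-02 09:25:00'),
        ('2008-10-02 09:30:00');
    """
)

# INNER JOIN: intervals without any earlier entry disappear from the result.
inner_totals = conn.execute(
    """
    SELECT ti.time_end, COUNT(*) AS interval_total
      FROM time_interval ti
      JOIN time_entry te ON te.time_entry < ti.time_end
     GROUP BY ti.time_end
     ORDER BY ti.time_end
    """
).fetchall()

# LEFT JOIN with COUNT(te.time_entry): empty intervals are kept with a total of 0.
left_totals = conn.execute(
    """
    SELECT ti.time_end, COUNT(te.time_entry) AS interval_total
      FROM time_interval ti
      LEFT JOIN time_entry te ON te.time_entry < ti.time_end
     GROUP BY ti.time_end
     ORDER BY ti.time_end
    """
).fetchall()

print(inner_totals)
print(left_totals)
```

Whether empty intervals should appear as zero rows or be omitted depends on how the table is consumed; the inner join matches the queries above.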
各自安好 2024-07-13 19:02:56

This uses quite a few SQL tricks (SQL Server 2005):

CREATE TABLE [dbo].[stackoverflow_165571](
    [visit] [datetime] NOT NULL
) ON [PRIMARY]
GO

;WITH buckets AS (
    SELECT dateadd(mi, (1 + datediff(mi, 0, visit - 1 - dateadd(dd, 0, datediff(dd, 0, visit))) / 5) * 5, 0) AS visit_bucket
            ,COUNT(*) AS visit_count
    FROM stackoverflow_165571
    GROUP BY dateadd(mi, (1 + datediff(mi, 0, visit - 1 - dateadd(dd, 0, datediff(dd, 0, visit))) / 5) * 5, 0)
)
SELECT LEFT(CONVERT(varchar, l.visit_bucket, 8), 5) + ' - ' + CONVERT(varchar, SUM(r.visit_count))
FROM buckets l
LEFT JOIN buckets r
    ON r.visit_bucket <= l.visit_bucket
GROUP BY l.visit_bucket
ORDER BY l.visit_bucket

Note that it puts all the times on the same day, and assumes they are in a datetime column. The only thing it doesn't do as your example does is strip the leading zeroes from the time representation.
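
On engines that support window functions, the self-join can be swapped for a running `SUM(...) OVER (ORDER BY ...)`. The sketch below is not the T-SQL above: it uses SQLite through Python's built-in `sqlite3` module (window functions need SQLite 3.25 or later), a hypothetical `visits` table holding the question's sample times, and a ceiling to the next 5-minute boundary as the bucketing rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (visit TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO visits VALUES (?)",
    [(f"2008-10-02 {t}:00",)
     for t in ["09:01", "09:04", "09:11", "09:13", "09:22", "09:24", "09:28"]],
)

running_totals = conn.execute(
    """
    WITH buckets AS (
        -- Integer ceiling of the epoch seconds to the next 300-second boundary.
        SELECT ((CAST(strftime('%s', visit) AS INTEGER) + 299) / 300) * 300 AS bucket,
               COUNT(*) AS visit_count
          FROM visits
         GROUP BY bucket
    )
    SELECT strftime('%H:%M', bucket, 'unixepoch') AS label,
           SUM(visit_count) OVER (ORDER BY bucket) AS running_total
      FROM buckets
     ORDER BY bucket
    """
).fetchall()

for label, total in running_totals:
    print(f"{label} - {total}")
```

Only buckets that contain at least one visit appear, and a visit falling exactly on a boundary is counted in the bucket that ends at that boundary.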
