SQL - Counting and summarizing (in ranges) the number of instances of a given transaction

Posted 2024-08-18 17:57:36

I've got an Occurrences table that contains one row for each time a user took an action. A user may take an action multiple times per day. It looks like this:

Date      Username
------    --------
1/1/9     User1
1/1/9     User1
1/1/9     User2
1/2/9     User1
1/2/9     User3
1/3/9     User1
1/3/9     User1
1/3/9     User1
1/3/9     User2
1/3/9     User3
1/4/9     User1
1/5/9     User1
1/6/9     User1
1/7/9     User1

For each day in the range, I'd like to show the count of people who have taken the action multiple times: let's say between 2 and 5 times, between 6 and 10 times, and more than 10 times. However, I only consider repeating the action on a different day to count as another instance of that action. For example, if a user did the thing 3 times on the first day and then again any number of times on the next day, I see that user as having done the action 2 times, and hence they should be in the 2-to-5 times column.

The result set corresponding to the above sample data would be:

        #_of_people       #_of_people       #_of_people       
        who_did_action    who_did_action    who_did_action        
Date    2to5_times        6to10_times       more_than_10      Total
-----   --------------    --------------    --------------    -----
1/1/9   0                 0                 0                 0
1/2/9   1                 0                 0                 1
1/3/9   3                 0                 0                 3   
...    
1/7/9   0                 1                 0                 1        

Note that each row of the result is counting the # of repeat actions for that specific day only - not cumulative.

  • The 1/1/9 row is all zeros since it's the first day and all actions are considered to be the first.
  • The 1/2/9 row is 1, 0, 0, 1 because only User1 has repeated - it's User3's first time.
  • The 1/3/9 row is 3, 0, 0, 3 because User1 has repeated twice, User2 has repeated once, and User3 has also repeated once.
  • The 1/7/9 row is 0, 1, 0, 1 because User1 has repeated 6 times.
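
For anyone who wants to try queries against this data, the sample rows above can be loaded with something like the following. This is only a sketch: the question does not give column types, so DATE and VARCHAR(40) are assumptions, and the dates written as 1/1/9 through 1/7/9 are assumed to mean 2009-01-01 through 2009-01-07.

```sql
-- Assumed schema: the column types are not stated in the question.
CREATE TABLE Occurrences (
    [Date]   DATE        NOT NULL,
    Username VARCHAR(40) NOT NULL
);

-- The 14 sample rows from the question; '1/1/9' is read as 2009-01-01 (an assumption).
INSERT INTO Occurrences ([Date], Username) VALUES
    ('20090101', 'User1'), ('20090101', 'User1'), ('20090101', 'User2'),
    ('20090102', 'User1'), ('20090102', 'User3'),
    ('20090103', 'User1'), ('20090103', 'User1'), ('20090103', 'User1'),
    ('20090103', 'User2'), ('20090103', 'User3'),
    ('20090104', 'User1'), ('20090105', 'User1'),
    ('20090106', 'User1'), ('20090107', 'User1');
```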

二手情话 2024-08-25 17:57:36

Best I could come up with, untested:

DECLARE @username VARCHAR(40)
DECLARE @date DATETIME
DECLARE @counter INT
    SET @counter = 1;

-- Note: occur_rank is pseudocode. T-SQL does not allow variable assignment in a
-- SELECT that also returns columns, so it only illustrates the intended logic.
WITH occur AS (
     -- one row per user per calendar day
     SELECT DISTINCT
            CONVERT(VARCHAR(10), o.date, 101) AS dt,
            o.username
       FROM OCCURRENCES o),
     occur_rank AS (
     -- intent: walk the rows in (username, dt) order; add 1 to the counter when the
     -- same user appears on the next consecutive day, otherwise reset it to 1
     SELECT x.dt,
            @username = x.username,
            @date = CAST(x.dt AS DATETIME),
            CASE WHEN @username = x.username AND @date + 1 = CAST(x.dt AS DATETIME) THEN @counter = @counter + 1 ELSE @counter = 1 END AS rnk
       FROM occur x)
  SELECT r.dt,
         SUM(CASE WHEN r.rnk BETWEEN 2 AND 5 THEN 1 ELSE 0 END) AS [2_to_5],
         SUM(CASE WHEN r.rnk BETWEEN 6 AND 10 THEN 1 ELSE 0 END) AS [6_to_10]
    FROM occur_rank r
GROUP BY r.dt

If you can get the numbering to reset on a date gap, the rest is easy. But none of the SQL Server ranking functions support that. So that leaves manually incrementing a variable.

The first CTE, occur, just returns a list of usernames associated with dates. The second CTE builds on it, adding the ranking for the final output to pivot on.
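
A numbering that resets on a date gap can also be sketched without variables, by combining ROW_NUMBER with date arithmetic (often called the "gaps and islands" technique). This is not the method described above, only a possible alternative; it assumes SQL Server 2008 or later for the DATE type, and the CTE name islands and the columns grp and rnk are made up for illustration.

```sql
WITH occur AS (
    -- one row per user per calendar day
    SELECT DISTINCT CAST(o.[Date] AS DATE) AS dt, o.Username
      FROM Occurrences o
),
islands AS (
    -- within a run of consecutive days, (dt minus row number) is constant,
    -- so grp identifies each unbroken streak per user
    SELECT dt,
           Username,
           DATEADD(DAY,
                   -CAST(ROW_NUMBER() OVER (PARTITION BY Username ORDER BY dt) AS INT),
                   dt) AS grp
      FROM occur
),
occur_rank AS (
    -- number the days inside each streak; the numbering restarts after a gap
    SELECT dt,
           ROW_NUMBER() OVER (PARTITION BY Username, grp ORDER BY dt) AS rnk
      FROM islands
)
SELECT r.dt,
       SUM(CASE WHEN r.rnk BETWEEN 2 AND 5 THEN 1 ELSE 0 END) AS [2_to_5],
       SUM(CASE WHEN r.rnk BETWEEN 6 AND 10 THEN 1 ELSE 0 END) AS [6_to_10]
  FROM occur_rank r
 GROUP BY r.dt
 ORDER BY r.dt;
```

Note that resetting on a gap counts consecutive days only. On the question's sample data, User2 skips 1/2/9, so this numbering restarts for User2 on 1/3/9 and the 2_to_5 count for that day comes out as 2 rather than the 3 the question expects.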

长不大的小祸害 2024-08-25 17:57:36

I used a derived table with a correlated subquery.

The correlated subquery results in a column ActionCount that counts the number of distinct dates up to and including that row's Date for that row's Username.

The resulting derived table lists each Date, Username, and ActionCount.

The outer query then groups the results by Date, counting the number of ActionCounts between 2 and 5, between 6 and 10, greater than 10, and, for the Total, greater than 1 (an ActionCount equal to 1 is not a 'repeat').

select
    [Date],
    sum(case when [ActionCount] between 2 and 5 then 1 else 0 end) [#_of_people_who_did_action_2to5_times],
    sum(case when [ActionCount] between 6 and 10 then 1 else 0 end) [#_of_people_who_did_action_6to10_times],
    sum(case when [ActionCount] > 10 then 1 else 0 end) [#_of_people_who_did_action_more_than_10],
    sum(case when [ActionCount] > 1 then 1 else 0 end) [Total]
from (
    select distinct
        dt.[Date],
        dt.[Username],
        (select count(distinct [Date]) from OCCURRENCES cd where cd.[Username] = dt.[Username] and cd.[Date] <= dt.[Date]) [ActionCount]
    from OCCURRENCES dt
) a
group by [Date]
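
As a possible variation, the correlated subquery could be swapped for a window function: DENSE_RANK partitioned by Username and ordered by Date gives the same "number of distinct dates so far" value for each row. This is a sketch rather than part of the answer above, and like the original it assumes the Date column holds whole days with no time component.

```sql
SELECT
    [Date],
    SUM(CASE WHEN [ActionCount] BETWEEN 2 AND 5 THEN 1 ELSE 0 END) AS [#_of_people_who_did_action_2to5_times],
    SUM(CASE WHEN [ActionCount] BETWEEN 6 AND 10 THEN 1 ELSE 0 END) AS [#_of_people_who_did_action_6to10_times],
    SUM(CASE WHEN [ActionCount] > 10 THEN 1 ELSE 0 END) AS [#_of_people_who_did_action_more_than_10],
    SUM(CASE WHEN [ActionCount] > 1 THEN 1 ELSE 0 END) AS [Total]
FROM (
    -- rows for the same user and day share one dense rank, so DISTINCT
    -- collapses them to a single row per user per day
    SELECT DISTINCT
        [Date],
        [Username],
        DENSE_RANK() OVER (PARTITION BY [Username] ORDER BY [Date]) AS [ActionCount]
    FROM Occurrences
) a
GROUP BY [Date]
ORDER BY [Date];
```

On the question's sample data this should produce the rows shown in the question for 1/1/9, 1/2/9, 1/3/9 and 1/7/9. The window-function form reads the table once instead of running a subquery per row, which may matter on larger tables.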