SQL高级分组&案例语句,这可能吗?

发布于 2025-02-09 08:50:59 字数 2461 浏览 2 评论 0原文

我有以下方案

系统子系统&文件名文件加载启动时间文件加载终止时间
alphaa1 trassactionTXT2022-06-19 08:00:002022-06-19 08:00:02
AlphaA2 Alpha A2 usercsv2022-06-1908:00:00: 02 2022-2022-06-19 08:00:05
AlphaA2员工CSV 2022-06-19 08:00:052022-06-19 08:00:08
AlphaA1 ManagersCSV2022-06-19 08:00:0820222-06-06-06-19 08:00: 16
AlphaA3客户CSV 2022-06-19 08:00:012022-06-19 08:00:04
GAMMAA1 TRASSACTIONTXT2022-06-19 10:00:482022-06-06-19
00:00:00: 00310 :2022-06-19 10:00:532022-06-19 10:00:54
GAMMAA2 Emplayscsv2022-06-19 10:00:272022-06-19 10:00:00:30
GammaA1 ManagersCSV2022-06--06--06--06-- 19 10:00:112022-06-19 10:00:17
GammaA3 Customerscsv2022-06-19 10:00:00: 13 2022-06-19 10:00:14

我想能够通过系统。所需的信息是何时开始(最早的时间),何时结束(最新时间)以及每个子系统在几秒钟内发生的时间。从上面的示例中,结果应如下:

系统整体系统负载启动时间总体系统负载终止时间a1花费A2时间A3时间为
Alpha2022-06-19 08:00:00:002022-06-19 08: 00:1600:00:1000:00:0600:00:03
伽马2022-06-19:11 2022-06-1910:00 :10:00:5400 :00:01

我找不到在查询中做到这一点的方法,我试图在每列的选择子句中选择选项,而仅按系统进行末端组。但这是不可能的,因为我必须使用一个汇总函数,而在“选择”条款中,

我的方法

SELECT System, 
min(StartTime) as 'File Load Start Time',
max(EndTime) as 'File Load End Time', 
CASE WHEN SubSystem LIKE 'A1%' THEN SUM(DATEDIFF(s, min(StartTime), max(EndTime))) Else 0 END AS 'A1 Time Taken',
CASE WHEN SubSystem LIKE 'A2%' THEN SUM(DATEDIFF(s, min(StartTime), max(EndTime))) Else 0 END AS 'A2 Time Taken',
CASE WHEN SubSystem LIKE 'A3%' THEN SUM(DATEDIFF(s, min(StartTime), max(EndTime))) Else 0 END AS 'A3 Time Taken'
FROM TABLE GROUP BY SYSTEM

在我的方法中不支持案例语句,但这是不起作用的条款也是我无法汇总的

I have the following scenario

SystemSubsystem & FilenameFile Load Start TimeFile Load End Time
AlphaA1 transactiontxt2022-06-19 08:00:002022-06-19 08:00:02
AlphaA2 userscsv2022-06-19 08:00:022022-06-19 08:00:05
AlphaA2 employeescsv2022-06-19 08:00:052022-06-19 08:00:08
AlphaA1 managerscsv2022-06-19 08:00:082022-06-19 08:00:16
AlphaA3 customerscsv2022-06-19 08:00:012022-06-19 08:00:04
GammaA1 transactiontxt2022-06-19 10:00:482022-06-19 10:00:53
GammaA2 userscsv2022-06-19 10:00:532022-06-19 10:00:54
GammaA2 employeescsv2022-06-19 10:00:272022-06-19 10:00:30
GammaA1 managerscsv2022-06-19 10:00:112022-06-19 10:00:17
GammaA3 customerscsv2022-06-19 10:00:132022-06-19 10:00:14

I want to be able to group the summary statistics by System. The info needed is when the overall started (earliest time), when it ended (latest time), and the time it took for each subsystem to occur, in seconds. From the example above, the result should look as below:

SystemOverall System Load Start TimeOverall System Load End TimeA1 Time TakenA2 Time TakenA3 Time Taken
Alpha2022-06-19 08:00:002022-06-19 08:00:1600:00:1000:00:0600:00:03
Gamma2022-06-19 10:00:112022-06-19 10:00:5400:00:1100:00:0400:00:01

I cannot find a way to do this in a query, I'm trying to do select subqueries in the select clause for each column, and at the end group by only System. But this is not possible because I'd have to use an aggregate function which is not being supported with case statements in subqueries in the select clause

My approach was something like

SELECT System, 
min(StartTime) as 'File Load Start Time',
max(EndTime) as 'File Load End Time', 
CASE WHEN SubSystem LIKE 'A1%' THEN SUM(DATEDIFF(s, min(StartTime), max(EndTime))) Else 0 END AS 'A1 Time Taken',
CASE WHEN SubSystem LIKE 'A2%' THEN SUM(DATEDIFF(s, min(StartTime), max(EndTime))) Else 0 END AS 'A2 Time Taken',
CASE WHEN SubSystem LIKE 'A3%' THEN SUM(DATEDIFF(s, min(StartTime), max(EndTime))) Else 0 END AS 'A3 Time Taken'
FROM TABLE GROUP BY SYSTEM

But this does not work because the case statements need to be in a group by clause as well, and I cannot aggregate them

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

罗罗贝儿 2025-02-16 08:50:59

您的原始查询只需要进行一些调整即可正常工作。我在工作台上对其进行了测试。现在起作用。

SELECT System, 
    min(StartTime) as 'File Load Start Time',
    max(EndTime) as 'File Load End Time', 
    sec_to_time(sum( case when substring_index(subsystem,' ',1)='a1' then to_seconds(endtime)-to_seconds(starttime) else 0 end 
    ) )as 'A1 Time Taken',
    sec_to_time(sum( case when substring_index(subsystem,' ',1)='a2' then to_seconds(endtime)-to_seconds(starttime) else 0 end 
    ) )as 'A2 Time Taken',
    sec_to_time(sum( case when substring_index(subsystem,' ',1)='a3' then to_seconds(endtime)-to_seconds(starttime) else 0 end 
    ) )as 'A3 Time Taken'
FROM test  GROUP BY SYSTEM
;

Your original query just needs a little bit of tweaking in order to work properly. I tested it in workbench. It works now.

SELECT System, 
    min(StartTime) as 'File Load Start Time',
    max(EndTime) as 'File Load End Time', 
    sec_to_time(sum( case when substring_index(subsystem,' ',1)='a1' then to_seconds(endtime)-to_seconds(starttime) else 0 end 
    ) )as 'A1 Time Taken',
    sec_to_time(sum( case when substring_index(subsystem,' ',1)='a2' then to_seconds(endtime)-to_seconds(starttime) else 0 end 
    ) )as 'A2 Time Taken',
    sec_to_time(sum( case when substring_index(subsystem,' ',1)='a3' then to_seconds(endtime)-to_seconds(starttime) else 0 end 
    ) )as 'A3 Time Taken'
FROM test  GROUP BY SYSTEM
;
治碍 2025-02-16 08:50:59

我假设您的子系统是有限的,不需要动态列(枢轴/crosstab)。

Folloing查询应为您提供所需的输出。

SELECT
    system_name AS "System", 
    min(file_load_start_time) AS "Overall System Load Start Time", 
    max(file_load_end_time) AS "Overall System Load End Time",
    (
    SELECT
        sec_to_time(sum(timestampdiff(SECOND, tsl_a1.file_load_start_time, tsl_a1.file_load_end_time)))
    FROM
        t_system_log tsl_a1
    WHERE
        tsl_a1.system_name = tsl.system_name
        AND LEFT(tsl_a1.subsystem_filename,
        2) = 'A1') AS "A1 Time Taken",
    (
    SELECT
        sec_to_time(sum(timestampdiff(SECOND, tsl_a2.file_load_start_time, tsl_a2.file_load_end_time)))
    FROM
        t_system_log tsl_a2
    WHERE
        tsl_a2.system_name = tsl.system_name
        AND LEFT(tsl_a2.subsystem_filename,
        2) = 'A2') AS "A2 Time Taken",
    (
    SELECT
        sec_to_time(sum(timestampdiff(SECOND, tsl_a3.file_load_start_time, tsl_a3.file_load_end_time)))
    FROM
        t_system_log tsl_a3
    WHERE
        tsl_a3.system_name = tsl.system_name
        AND LEFT(tsl_a3.subsystem_filename,
        2) = 'A3') AS "A3 Time Taken"
FROM
    t_system_log tsl
GROUP BY
    system_name;

如果您的子系统是动态的,则应使用枢轴查询而不是子查询。子查询可能会影响性能。


将所有子征服转换为内联,以提高性能。
还更新了我的小提琴。

SELECT
    system_name AS "System", 
    min(file_load_start_time) AS "Overall System Load Start Time", 
    max(file_load_end_time) AS "Overall System Load End Time",
    sec_to_time(sum(case when  LEFT(tsl.subsystem_filename, 2) = 'A1' then timestampdiff(SECOND, tsl.file_load_start_time, tsl.file_load_end_time) else 0 end)) as "A1 Time Taken",
    sec_to_time(sum(case when  LEFT(tsl.subsystem_filename, 2) = 'A2' then timestampdiff(SECOND, tsl.file_load_start_time, tsl.file_load_end_time) else 0 end)) as "A2 Time Taken",
    sec_to_time(sum(case when  LEFT(tsl.subsystem_filename, 2) = 'A3' then timestampdiff(SECOND, tsl.file_load_start_time, tsl.file_load_end_time) else 0 end)) as "A3 Time Taken"
FROM
    t_system_log tsl
GROUP BY
    system_name;

I am assuming your subsystems are finite and doesn't require dynamic columns (pivot/crosstab).

Folloing query should give you the desired output.

SELECT
    system_name AS "System", 
    min(file_load_start_time) AS "Overall System Load Start Time", 
    max(file_load_end_time) AS "Overall System Load End Time",
    (
    SELECT
        sec_to_time(sum(timestampdiff(SECOND, tsl_a1.file_load_start_time, tsl_a1.file_load_end_time)))
    FROM
        t_system_log tsl_a1
    WHERE
        tsl_a1.system_name = tsl.system_name
        AND LEFT(tsl_a1.subsystem_filename,
        2) = 'A1') AS "A1 Time Taken",
    (
    SELECT
        sec_to_time(sum(timestampdiff(SECOND, tsl_a2.file_load_start_time, tsl_a2.file_load_end_time)))
    FROM
        t_system_log tsl_a2
    WHERE
        tsl_a2.system_name = tsl.system_name
        AND LEFT(tsl_a2.subsystem_filename,
        2) = 'A2') AS "A2 Time Taken",
    (
    SELECT
        sec_to_time(sum(timestampdiff(SECOND, tsl_a3.file_load_start_time, tsl_a3.file_load_end_time)))
    FROM
        t_system_log tsl_a3
    WHERE
        tsl_a3.system_name = tsl.system_name
        AND LEFT(tsl_a3.subsystem_filename,
        2) = 'A3') AS "A3 Time Taken"
FROM
    t_system_log tsl
GROUP BY
    system_name;

In case, your subsystems are dynamic, you should use pivot query instead of subquery. Also subquery might impact on the performance.

My Fiddle

EDIT:
Converting all the sub-queries to inline to improve performance.
Updated my Fiddle also.

Updated Fiddle

SELECT
    system_name AS "System", 
    min(file_load_start_time) AS "Overall System Load Start Time", 
    max(file_load_end_time) AS "Overall System Load End Time",
    sec_to_time(sum(case when  LEFT(tsl.subsystem_filename, 2) = 'A1' then timestampdiff(SECOND, tsl.file_load_start_time, tsl.file_load_end_time) else 0 end)) as "A1 Time Taken",
    sec_to_time(sum(case when  LEFT(tsl.subsystem_filename, 2) = 'A2' then timestampdiff(SECOND, tsl.file_load_start_time, tsl.file_load_end_time) else 0 end)) as "A2 Time Taken",
    sec_to_time(sum(case when  LEFT(tsl.subsystem_filename, 2) = 'A3' then timestampdiff(SECOND, tsl.file_load_start_time, tsl.file_load_end_time) else 0 end)) as "A3 Time Taken"
FROM
    t_system_log tsl
GROUP BY
    system_name;
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文