查询优化-如何在此查询中实现这一点?

发布于 2025-01-09 19:49:27 字数 1823 浏览 2 评论 0原文

我该如何优化这个查询?我已经创建了索引、分区、增加了工作内存,但执行时间仍然是 35 秒。如何将其缩短至 10-15 秒? 更新:

  1. 删除了每个时间戳从 utc 到本地时间的转换,即 time_stamp AT TIME ZONE 'utc' AT TIME ZONE,这将性能提高了大约 5 秒。当前执行时间:36.5 秒。
explain analyse select
DATE_TRUNC('day', time_stamp) as "time_stamp",
COUNT(DISTINCT id) AS alarm_count,
COUNT(DISTINCT patient_id) AS patient_count
FROM
alarm_management.alarm
WHERE
tenant_name = 'abc'
and
unit = ANY('{a,b,c,d,e,f,g,h,i,j,k}'::text[])
AND
time_stamp BETWEEN '2021-09-15 02:25:00' AND '2021-12-14 04:36:45'
AND
severity_label = ANY('{a,b,c,d}'::text[])

AND derived_label IS NOT NULL
GROUP by 1

解释(分析、详细、缓冲区)输出-

GroupAggregate  (cost=3064683.77..3215681.44 rows=308821 width=24) (actual time=24242.730..35145.380 rows=91 loops=1)
  Group Key: (date_trunc('day'::text, alarm_hospitalc_burn_2021_9.time_stamp))
  ->  Sort  (cost=3064683.77..3101468.12 rows=14713740 width=40) (actual time=24167.513..25036.293 rows=16369464 loops=1)
        Sort Key: (date_trunc('day'::text, alarm_hospitalc_burn_2021_9.time_stamp))
        Sort Method: quicksort  Memory: 1672081kB
        ->  Append  (cost=0.00..1312964.42 rows=14713740 width=40) (actual time=0.308..20958.290 rows=16369464 loops=1)
              ->  Seq Scan on alarm_hospitalc_burn_2021_9  (cost=0.00..7175.10 rows=69691 width=40) (actual time=0.307..127.521 rows=94286 loops=1)
                    Filter: ((derived_label IS NOT NULL) AND (time_stamp >= '2021-09-15 02:25:00'::timestamp without time zone) AND (time_stamp <= '2021-12-14 04:36:45'::timestamp without time zone) AND (tenant_name = 'HospitalC'::text) AND (severity_label = ANY ('{"Short Yellow",Cyan,Red,Yellow}'::text[])) AND (unit = ANY ('{Burn,Delivery,EDI,EDT,EDW,ICU1,ICU2,ICU3P,ICU4P,PP,Tele}'::text[])))
              

How can I optimize this query ? I have created indexes,partitions,increased worker memory but the execution time is still 35s. How can I minimize it to 10-15 seconds?
Update :

  1. Removed conversion of every time stamp from utc to local time i.e. time_stamp AT TIME ZONE 'utc' AT TIME ZONE which improved the performance by approximately 5 seconds. Current execution time : 36.5 seconds.
explain analyse select
DATE_TRUNC('day', time_stamp) as "time_stamp",
COUNT(DISTINCT id) AS alarm_count,
COUNT(DISTINCT patient_id) AS patient_count
FROM
alarm_management.alarm
WHERE
tenant_name = 'abc'
and
unit = ANY('{a,b,c,d,e,f,g,h,i,j,k}'::text[])
AND
time_stamp BETWEEN '2021-09-15 02:25:00' AND '2021-12-14 04:36:45'
AND
severity_label = ANY('{a,b,c,d}'::text[])

AND derived_label IS NOT NULL
GROUP by 1

Explain(analyze, verbose, buffers) output-

GroupAggregate  (cost=3064683.77..3215681.44 rows=308821 width=24) (actual time=24242.730..35145.380 rows=91 loops=1)
  Group Key: (date_trunc('day'::text, alarm_hospitalc_burn_2021_9.time_stamp))
  ->  Sort  (cost=3064683.77..3101468.12 rows=14713740 width=40) (actual time=24167.513..25036.293 rows=16369464 loops=1)
        Sort Key: (date_trunc('day'::text, alarm_hospitalc_burn_2021_9.time_stamp))
        Sort Method: quicksort  Memory: 1672081kB
        ->  Append  (cost=0.00..1312964.42 rows=14713740 width=40) (actual time=0.308..20958.290 rows=16369464 loops=1)
              ->  Seq Scan on alarm_hospitalc_burn_2021_9  (cost=0.00..7175.10 rows=69691 width=40) (actual time=0.307..127.521 rows=94286 loops=1)
                    Filter: ((derived_label IS NOT NULL) AND (time_stamp >= '2021-09-15 02:25:00'::timestamp without time zone) AND (time_stamp <= '2021-12-14 04:36:45'::timestamp without time zone) AND (tenant_name = 'HospitalC'::text) AND (severity_label = ANY ('{"Short Yellow",Cyan,Red,Yellow}'::text[])) AND (unit = ANY ('{Burn,Delivery,EDI,EDT,EDW,ICU1,ICU2,ICU3P,ICU4P,PP,Tele}'::text[])))
              

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

同展鸳鸯锦 2025-01-16 19:49:27

该函数可以用 SQL 编写,这可能会稍微快一些:

CREATE OR REPLACE FUNCTION dbo.get_time_group ( _date_type TEXT ) 
RETURNS TEXT 
LANGUAGE sql -- SQL is good enough
IMMUTABLE -- better for performance, next call is faster because of caching
AS 
$
    SELECT CASE $1 
            WHEN 'hour' THEN 'hour' 
            ELSE 'day' 
        END;
$;

但最重要的是查询计划。

The function can be written in SQL, that might be slightly faster:

CREATE OR REPLACE FUNCTION dbo.get_time_group ( _date_type TEXT ) 
RETURNS TEXT 
LANGUAGE sql -- SQL is good enough
IMMUTABLE -- better for performance, next call is faster because of caching
AS 
$
    SELECT CASE $1 
            WHEN 'hour' THEN 'hour' 
            ELSE 'day' 
        END;
$;

But the most important thing is the query plan.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文