查询优化-如何在此查询中实现这一点？

发布于 2025-01-09 19:49:27 字数 1823 浏览 2 评论 0原文

我该如何优化这个查询？我已经创建了索引、分区、增加了工作内存，但执行时间仍然是 35 秒。如何将其缩短至 10-15 秒？更新：

删除了每个时间戳从 utc 到本地时间的转换，即 time_stamp AT TIME ZONE 'utc' AT TIME ZONE，这将性能提高了大约 5 秒。当前执行时间：36.5 秒。

explain analyse select
DATE_TRUNC('day', time_stamp) as "time_stamp",
COUNT(DISTINCT id) AS alarm_count,
COUNT(DISTINCT patient_id) AS patient_count
FROM
alarm_management.alarm
WHERE
tenant_name = 'abc'
and
unit = ANY('{a,b,c,d,e,f,g,h,i,j,k}'::text[])
AND
time_stamp BETWEEN '2021-09-15 02:25:00' AND '2021-12-14 04:36:45'
AND
severity_label = ANY('{a,b,c,d}'::text[])

AND derived_label IS NOT NULL
GROUP by 1

解释（分析、详细、缓冲区）输出-

GroupAggregate  (cost=3064683.77..3215681.44 rows=308821 width=24) (actual time=24242.730..35145.380 rows=91 loops=1)
  Group Key: (date_trunc('day'::text, alarm_hospitalc_burn_2021_9.time_stamp))
  ->  Sort  (cost=3064683.77..3101468.12 rows=14713740 width=40) (actual time=24167.513..25036.293 rows=16369464 loops=1)
        Sort Key: (date_trunc('day'::text, alarm_hospitalc_burn_2021_9.time_stamp))
        Sort Method: quicksort  Memory: 1672081kB
        ->  Append  (cost=0.00..1312964.42 rows=14713740 width=40) (actual time=0.308..20958.290 rows=16369464 loops=1)
              ->  Seq Scan on alarm_hospitalc_burn_2021_9  (cost=0.00..7175.10 rows=69691 width=40) (actual time=0.307..127.521 rows=94286 loops=1)
                    Filter: ((derived_label IS NOT NULL) AND (time_stamp >= '2021-09-15 02:25:00'::timestamp without time zone) AND (time_stamp <= '2021-12-14 04:36:45'::timestamp without time zone) AND (tenant_name = 'HospitalC'::text) AND (severity_label = ANY ('{"Short Yellow",Cyan,Red,Yellow}'::text[])) AND (unit = ANY ('{Burn,Delivery,EDI,EDT,EDW,ICU1,ICU2,ICU3P,ICU4P,PP,Tele}'::text[])))

原文

How can I optimize this query ? I have created indexes,partitions,increased worker memory but the execution time is still 35s. How can I minimize it to 10-15 seconds?
Update :

Removed conversion of every time stamp from utc to local time i.e. time_stamp AT TIME ZONE 'utc' AT TIME ZONE which improved the performance by approximately 5 seconds. Current execution time : 36.5 seconds.

explain analyse select
DATE_TRUNC('day', time_stamp) as "time_stamp",
COUNT(DISTINCT id) AS alarm_count,
COUNT(DISTINCT patient_id) AS patient_count
FROM
alarm_management.alarm
WHERE
tenant_name = 'abc'
and
unit = ANY('{a,b,c,d,e,f,g,h,i,j,k}'::text[])
AND
time_stamp BETWEEN '2021-09-15 02:25:00' AND '2021-12-14 04:36:45'
AND
severity_label = ANY('{a,b,c,d}'::text[])

AND derived_label IS NOT NULL
GROUP by 1

Explain(analyze, verbose, buffers) output-

GroupAggregate  (cost=3064683.77..3215681.44 rows=308821 width=24) (actual time=24242.730..35145.380 rows=91 loops=1)
  Group Key: (date_trunc('day'::text, alarm_hospitalc_burn_2021_9.time_stamp))
  ->  Sort  (cost=3064683.77..3101468.12 rows=14713740 width=40) (actual time=24167.513..25036.293 rows=16369464 loops=1)
        Sort Key: (date_trunc('day'::text, alarm_hospitalc_burn_2021_9.time_stamp))
        Sort Method: quicksort  Memory: 1672081kB
        ->  Append  (cost=0.00..1312964.42 rows=14713740 width=40) (actual time=0.308..20958.290 rows=16369464 loops=1)
              ->  Seq Scan on alarm_hospitalc_burn_2021_9  (cost=0.00..7175.10 rows=69691 width=40) (actual time=0.307..127.521 rows=94286 loops=1)
                    Filter: ((derived_label IS NOT NULL) AND (time_stamp >= '2021-09-15 02:25:00'::timestamp without time zone) AND (time_stamp <= '2021-12-14 04:36:45'::timestamp without time zone) AND (tenant_name = 'HospitalC'::text) AND (severity_label = ANY ('{"Short Yellow",Cyan,Red,Yellow}'::text[])) AND (unit = ANY ('{Burn,Delivery,EDI,EDT,EDW,ICU1,ICU2,ICU3P,ICU4P,PP,Tele}'::text[])))

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

同展鸳鸯锦 2025-01-16 19:49:27

该函数可以用 SQL 编写，这可能会稍微快一些：

CREATE OR REPLACE FUNCTION dbo.get_time_group ( _date_type TEXT ) 
RETURNS TEXT 
LANGUAGE sql -- SQL is good enough
IMMUTABLE -- better for performance, next call is faster because of caching
AS 
$
    SELECT CASE $1 
            WHEN 'hour' THEN 'hour' 
            ELSE 'day' 
        END;
$;

但最重要的是查询计划。

The function can be written in SQL, that might be slightly faster:

CREATE OR REPLACE FUNCTION dbo.get_time_group ( _date_type TEXT ) 
RETURNS TEXT 
LANGUAGE sql -- SQL is good enough
IMMUTABLE -- better for performance, next call is faster because of caching
AS 
$
    SELECT CASE $1 
            WHEN 'hour' THEN 'hour' 
            ELSE 'day' 
        END;
$;

But the most important thing is the query plan.

回复收藏 0 原文

~没有更多了~