触发函数刷新Postgres物化视图并捕获刷新结束时间？

发布于 2025-01-10 10:43:42 字数 2026 浏览 0 评论 0原文

我正在尝试在 Postgres 数据库中构建一个汇总表，其中包含有关何时刷新各种物化视图的信息。我还希望该表能够触发实际的刷新。

该表所需的格式如下，将其命名为 mv_refresh_monitor：

view_name	refresh_time_startfresh_time_end	2022-02-01 22
view_one	2022-02-01 22:10:59.234567	:11:59.234567

该表显示 view_one 最后一次刷新是在深夜2月1日，刷新用时1分钟完成。

我想做的是通过更新 refresh_time_start 字段来触发物化刷新；这样做会触发 view_name 字段中的物化视图刷新，然后还会更新同一行的 refresh_time_end 字段以捕获刷新完成的时间。

我当前的实现使用一个函数（更新监视器表）、一个触发器函数（刷新视图并调用函数）以及监视器表上的触发器来调用触发器函数。这仅适用于单个物化视图：

CREATE OR REPLACE FUNCTION "ingested_digital_spend"."timestamp_refresh_end"()
  RETURNS "pg_catalog"."void" AS $BODY$
        UPDATE schema.mv_refresh_monitor SET refresh_time_end = CURRENT_TIMESTAMP AT TIME ZONE 'America/Los_Angeles';
$BODY$
  LANGUAGE SQL;

CREATE OR REPLACE FUNCTION "schema"."refresh_materialized_view"()
  RETURNS "pg_catalog"."trigger" AS $BODY$
BEGIN
IF NEW.view_name = 'view_one' AND NEW.refresh_time_start IS DISTINCT FROM OLD.refresh_time_start THEN
REFRESH MATERIALIZED VIEW schema.view_one;
        PERFORM timestamp_refresh_end();
END IF;
RETURN NULL;
END;
$BODY$
  LANGUAGE plpgsql;

CREATE TRIGGER "refresh_mv" AFTER UPDATE ON "schema"."mv_refresh_monitor"
FOR EACH ROW
EXECUTE PROCEDURE "schema"."refresh_materialized_view"();

这几乎可行，但我正在尝试改进/修复三件事：

使其参数化，以便我不必编写新的 IF THEN END IF每次我向架构添加新的物化视图时，都会在触发器函数中添加子句。
- 这看起来应该不是太难，我只是还没有找到参数化 PL/pgSQL 函数的正确方法。
目前，尽管刷新操作本身需要 80 秒，但 refresh_time_start 和 refresh_time_end 中记录的时间是相同的。我不确定如何确定 CURRENT_TIMESTAMP 操作的范围，以便它们在最初调用触发器函数时不会评估为相同的时间戳。
- 这感觉应该是可能的，但我对此不太确定。
如果可能的话，我希望实际刷新发生在“后台”。也就是说，更新立即完成并释放正在执行更新的会话。现在，监视表上的事务只有在视图 REFRESH 事务本身完成后才能完成，因此客户端会话会挂起，直到刷新完成。
- 这可能是不可能的。

有什么建议或解决方案可以更接近这三个要求？

原文

I am trying to build a summary table in our Postgres database that contains information about when various materialized views were refreshed. I would also like this table to trigger the actual refreshes.

The desired format for the table is as below, call it mv_refresh_monitor:

view_name	refresh_time_start	refresh_time_end
view_one	2022-02-01 22:10:59.234567	2022-02-01 22:11:59.234567

The table shows view_one was last refreshed late at night February 1st, and the refresh took 1 minute to complete.

What I would like to do is trigger the materialized refreshes by updating the refresh_time_start field; doing so would trigger the materialized view in the view_name field to refresh, and then also update the same row's refresh_time_end field to capture the time when the refresh is done.

My current implementation uses a function (to update the monitor table), a trigger function (to both refresh the view and call the function), and a trigger on the monitor table to call the trigger function. This is scoped only for a single materialized view:

CREATE OR REPLACE FUNCTION "ingested_digital_spend"."timestamp_refresh_end"()
  RETURNS "pg_catalog"."void" AS $BODY$
        UPDATE schema.mv_refresh_monitor SET refresh_time_end = CURRENT_TIMESTAMP AT TIME ZONE 'America/Los_Angeles';
$BODY$
  LANGUAGE SQL;

CREATE OR REPLACE FUNCTION "schema"."refresh_materialized_view"()
  RETURNS "pg_catalog"."trigger" AS $BODY$
BEGIN
IF NEW.view_name = 'view_one' AND NEW.refresh_time_start IS DISTINCT FROM OLD.refresh_time_start THEN
REFRESH MATERIALIZED VIEW schema.view_one;
        PERFORM timestamp_refresh_end();
END IF;
RETURN NULL;
END;
$BODY$
  LANGUAGE plpgsql;

CREATE TRIGGER "refresh_mv" AFTER UPDATE ON "schema"."mv_refresh_monitor"
FOR EACH ROW
EXECUTE PROCEDURE "schema"."refresh_materialized_view"();

This almost works, but I'm trying to improve / fix three things:

Make this parameterized so that I don't have to write new IF THEN END IF clauses in the trigger function each time I add a new materialized view to the schema.
- This seems like it shouldn't be terribly hard, I just haven't figured out the right way to parameterize PL/pgSQL functions yet.
Currently the time recorded in refresh_time_start and refresh_time_end are identical, despite the refresh operation itself taking 80 seconds. I am not sure how to scope the CURRENT_TIMESTAMP operations so that they don't evaluate to the same timestamp when the trigger function is initially called.
- This feels like it should be possible, but I'm not as sure of this.
I would like the actual refresh to happen "in the background" if possible. That is, for the UPDATE to complete immediately and release the session that is performing the UPDATE. Right now the transaction on the monitor table doesn't complete until the view REFRESH transaction itself completes, so the client session hangs until the refresh completes.
- This might be impossible.

Any suggestions or solutions for getting closer to these three requirements?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

云雾 2025-01-17 10:43:42

我认为你通过在触发器/函数中投入太多来解决这个问题的方式是错误的。我会选择：

您向其提供视图名称和开始名称的函数。它执行REFRESH MATERIALIZED VIEW some_view>并用信息更新mv_refresh_monitor。有关如何参数化的信息，请参阅动态查询
对于 CURRENT_TIMESTAMP 问题，请参阅当前日期/时间。 CURRENT_TIMESTAMP 按设计捕获事务开始时的时间戳，并且在事务中不会更改。您正在寻找 transaction_timestamp()/statement_timestamp()。
如果您不将 REFRESH MATERIALIZED VIEW 与 UPDATE 绑定，则可以消除此问题。