触发函数刷新Postgres物化视图并捕获刷新结束时间?
我正在尝试在 Postgres 数据库中构建一个汇总表,其中包含有关何时刷新各种物化视图的信息。我还希望该表能够触发实际的刷新。
该表所需的格式如下,将其命名为 mv_refresh_monitor
:
view_name | refresh_time_startfresh_time_end | 2022-02-01 22 |
---|---|---|
view_one | 2022-02-01 22:10:59.234567 | :11:59.234567 |
该表显示 view_one
最后一次刷新是在深夜2月1日,刷新用时1分钟完成。
我想做的是通过更新 refresh_time_start
字段来触发物化刷新;这样做会触发 view_name
字段中的物化视图刷新,然后还会更新同一行的 refresh_time_end
字段以捕获刷新完成的时间。
我当前的实现使用一个函数(更新监视器表)、一个触发器函数(刷新视图并调用函数)以及监视器表上的触发器来调用触发器函数。这仅适用于单个物化视图:
CREATE OR REPLACE FUNCTION "ingested_digital_spend"."timestamp_refresh_end"()
RETURNS "pg_catalog"."void" AS $BODY$
UPDATE schema.mv_refresh_monitor SET refresh_time_end = CURRENT_TIMESTAMP AT TIME ZONE 'America/Los_Angeles';
$BODY$
LANGUAGE SQL;
CREATE OR REPLACE FUNCTION "schema"."refresh_materialized_view"()
RETURNS "pg_catalog"."trigger" AS $BODY$
BEGIN
IF NEW.view_name = 'view_one' AND NEW.refresh_time_start IS DISTINCT FROM OLD.refresh_time_start THEN
REFRESH MATERIALIZED VIEW schema.view_one;
PERFORM timestamp_refresh_end();
END IF;
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER "refresh_mv" AFTER UPDATE ON "schema"."mv_refresh_monitor"
FOR EACH ROW
EXECUTE PROCEDURE "schema"."refresh_materialized_view"();
这几乎可行,但我正在尝试改进/修复三件事:
- 使其参数化,以便我不必编写新的
IF THEN END IF
每次我向架构添加新的物化视图时,都会在触发器函数中添加子句。- 这看起来应该不是太难,我只是还没有找到参数化 PL/pgSQL 函数的正确方法。
- 目前,尽管刷新操作本身需要 80 秒,但
refresh_time_start
和refresh_time_end
中记录的时间是相同的。我不确定如何确定 CURRENT_TIMESTAMP 操作的范围,以便它们在最初调用触发器函数时不会评估为相同的时间戳。- 这感觉应该是可能的,但我对此不太确定。
- 如果可能的话,我希望实际刷新发生在“后台”。也就是说,更新立即完成并释放正在执行更新的会话。现在,监视表上的事务只有在视图 REFRESH 事务本身完成后才能完成,因此客户端会话会挂起,直到刷新完成。
- 这可能是不可能的。
有什么建议或解决方案可以更接近这三个要求?
I am trying to build a summary table in our Postgres database that contains information about when various materialized views were refreshed. I would also like this table to trigger the actual refreshes.
The desired format for the table is as below, call it mv_refresh_monitor
:
view_name | refresh_time_start | refresh_time_end |
---|---|---|
view_one | 2022-02-01 22:10:59.234567 | 2022-02-01 22:11:59.234567 |
The table shows view_one
was last refreshed late at night February 1st, and the refresh took 1 minute to complete.
What I would like to do is trigger the materialized refreshes by updating the refresh_time_start
field; doing so would trigger the materialized view in the view_name
field to refresh, and then also update the same row's refresh_time_end
field to capture the time when the refresh is done.
My current implementation uses a function (to update the monitor table), a trigger function (to both refresh the view and call the function), and a trigger on the monitor table to call the trigger function. This is scoped only for a single materialized view:
CREATE OR REPLACE FUNCTION "ingested_digital_spend"."timestamp_refresh_end"()
RETURNS "pg_catalog"."void" AS $BODY$
UPDATE schema.mv_refresh_monitor SET refresh_time_end = CURRENT_TIMESTAMP AT TIME ZONE 'America/Los_Angeles';
$BODY$
LANGUAGE SQL;
CREATE OR REPLACE FUNCTION "schema"."refresh_materialized_view"()
RETURNS "pg_catalog"."trigger" AS $BODY$
BEGIN
IF NEW.view_name = 'view_one' AND NEW.refresh_time_start IS DISTINCT FROM OLD.refresh_time_start THEN
REFRESH MATERIALIZED VIEW schema.view_one;
PERFORM timestamp_refresh_end();
END IF;
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER "refresh_mv" AFTER UPDATE ON "schema"."mv_refresh_monitor"
FOR EACH ROW
EXECUTE PROCEDURE "schema"."refresh_materialized_view"();
This almost works, but I'm trying to improve / fix three things:
- Make this parameterized so that I don't have to write new
IF THEN END IF
clauses in the trigger function each time I add a new materialized view to the schema.- This seems like it shouldn't be terribly hard, I just haven't figured out the right way to parameterize PL/pgSQL functions yet.
- Currently the time recorded in
refresh_time_start
andrefresh_time_end
are identical, despite the refresh operation itself taking 80 seconds. I am not sure how to scope theCURRENT_TIMESTAMP
operations so that they don't evaluate to the same timestamp when the trigger function is initially called.- This feels like it should be possible, but I'm not as sure of this.
- I would like the actual refresh to happen "in the background" if possible. That is, for the UPDATE to complete immediately and release the session that is performing the UPDATE. Right now the transaction on the monitor table doesn't complete until the view REFRESH transaction itself completes, so the client session hangs until the refresh completes.
- This might be impossible.
Any suggestions or solutions for getting closer to these three requirements?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我认为你通过在触发器/函数中投入太多来解决这个问题的方式是错误的。我会选择:
您向其提供视图名称和开始名称的函数。它执行
REFRESH MATERIALIZED VIEW some_view>
并用信息更新mv_refresh_monitor
。有关如何参数化的信息,请参阅动态查询对于
CURRENT_TIMESTAMP
问题,请参阅当前日期/时间。CURRENT_TIMESTAMP
按设计捕获事务开始时的时间戳,并且在事务中不会更改。您正在寻找transaction_timestamp()/statement_timestamp()
。如果您不将
REFRESH MATERIALIZED VIEW
与UPDATE
绑定,则可以消除此问题。I think you are going about this the wrong way by pushing to much into the trigger/function. I would go with:
A function that you provide the view name and start name to. It does the
REFRESH MATERIALIZED VIEW some_view>
and updatesmv_refresh_monitor
with information. For information on how to parametrize this see Dynamic QueriesFor the
CURRENT_TIMESTAMP
issue see Current date/time.CURRENT_TIMESTAMP
by design captures the timestamp at the start of a transaction and does not change in the transaction. You are looking fortransaction_timestamp()/statement_timestamp()
.If you don't tie the
REFRESH MATERIALIZED VIEW
to theUPDATE
you eliminate this issue.