How can I make Airflow send email alerts only after multiple task instance failures?
We have some DAGs that run often and are occasionally flaky even with retries. We'd like to get alerts only if this DAG (or a task within the DAG) fails multiple times in a row.
i.e., if the DAG runs every hour, we'd like to get an email alert from Airflow only if it fails each hour 3 times in a row.
Is there a way we can configure Airflow to do this?
@Zak
There are a couple of approaches to this.
The first option is to programmatically create a function that monitors the try_number (the attempt count of a task instance). Then you can use an on_failure_callback, or perhaps even a BranchPythonOperator, to send an email based on your desired threshold.
Using DagRun (from airflow.models import DagRun), we can solve your question:
- use on_failure_callback inside your default parameters: on_failure_callback: name_of_function_to_send_email_with_retries
- use from airflow.utils.email import send_email to send this email
Something along these lines should suffice, though I have not executed this myself.
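One way such a callback could look is sketched here, assuming Airflow 2.x (where DagRun exposes execution_date and callbacks can query the metadata database). The threshold, the recipient address, and the failed_in_a_row helper are hypothetical names introduced for illustration; only on_failure_callback, DagRun, and send_email come from the answer. Because the callback fires only on failure, the current run is counted as failed and only the earlier runs are looked up.

```python
FAILURE_THRESHOLD = 3  # hypothetical: alert after this many consecutive failed runs
ALERT_RECIPIENTS = ["alerts@example.com"]  # hypothetical address


def failed_in_a_row(states, threshold=FAILURE_THRESHOLD):
    """Return True when the newest `threshold` states (newest first) are all 'failed'."""
    newest = list(states)[:threshold]
    return len(newest) == threshold and all(s == "failed" for s in newest)


def name_of_function_to_send_email_with_retries(context):
    # Airflow imports are local so the helper above stays importable without Airflow.
    from airflow.models import DagRun
    from airflow.utils.email import send_email

    ti = context["task_instance"]
    current_run = context["dag_run"]

    # Earlier runs of this DAG, newest first (assumes Airflow 2.x `execution_date`).
    earlier_runs = sorted(
        (
            run
            for run in DagRun.find(dag_id=ti.dag_id)
            if run.execution_date < current_run.execution_date
        ),
        key=lambda run: run.execution_date,
        reverse=True,
    )[: FAILURE_THRESHOLD - 1]

    # This callback only fires on failure, so the current run counts as failed.
    states = ["failed"]
    for run in earlier_runs:
        earlier_ti = run.get_task_instance(ti.task_id)
        states.append(earlier_ti.state if earlier_ti else None)

    if failed_in_a_row(states):
        send_email(
            to=ALERT_RECIPIENTS,
            subject=f"{ti.dag_id}.{ti.task_id} failed {FAILURE_THRESHOLD} runs in a row",
            html_content=f"Latest failed run: {current_run.run_id}",
        )
```

With this in place, setting email_on_failure to False on the same tasks would avoid receiving Airflow's built-in failure email in addition to the thresholded one.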
The second option is to rely on the BaseOperator and pass the retry key/value pairs (such as retries and retry_delay) in your default parameters. Naturally, there is a tradeoff between the absolute control over the process that route 1 gives you and the ease of route 2.
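A minimal sketch of what those default parameters might contain, using BaseOperator's standard retry and email arguments. The recipient address and the specific values are assumptions for illustration. Note that this setup suppresses emails while a task is still retrying and sends one only after the last retry fails, so it counts attempts within a single run rather than failures across consecutive scheduled runs.

```python
from datetime import timedelta

# Hypothetical values; pass this dict as DAG(..., default_args=default_args).
default_args = {
    "email": ["alerts@example.com"],      # hypothetical recipient
    "email_on_retry": False,              # stay quiet while retries remain
    "email_on_failure": True,             # email once all retries are exhausted
    "retries": 3,                         # attempts after the first failure
    "retry_delay": timedelta(minutes=5),  # wait between attempts
}
```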
Both routes assume you have correctly configured Apache Airflow to send emails.