Google Cloud Composer Airflow SQLAlchemy OperationalError causes DAG to hang forever

Posted on 2025-02-03 03:59:04

I have a bunch of tasks within a Cloud Composer Airflow DAG, one of which is a KubernetesPodOperator. This task seems to get stuck in the scheduled state forever, so the DAG run continues for 15 hours without finishing (it normally takes about an hour). I have to manually mark it as failed for it to end.

I've set the DAG timeout to 2 hours but it does not make any difference.

The Cloud Composer logs show the following error:

sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not connect to server: 
Connection refused
    Is the server running on host "airflow-sqlproxy-service.default.svc.cluster.local" (10.7.124.107) 
    and accepting TCP/IP connections on port 3306?

The error log also gives me a link to this documentation about that error type: https://docs.sqlalchemy.org/en/13/errors.html#operationalerror

When the DAG is next triggered on schedule, it works fine without any fix required. This issue happens intermittently, and we've not been able to reproduce it.

Does anyone know the cause of this error and how to fix it?

Comments (1)

醉殇 2025-02-10 03:59:04

The issue is related to SQLAlchemy using one session per thread and creating a callable session that can be used later in the Airflow code. If there is a delay between queries on such a session, MySQL might close the connection in the meantime. The connection timeout is set to approximately 10 minutes.

Solutions:

  • Use the airflow.utils.db.provide_session decorator. This decorator
    provides a valid session to the Airflow database in the session
    parameter and closes the session at the end of the function.
  • Do not use a single long-running function. Instead, move all database
    queries to separate functions, so that there are multiple functions
    with the airflow.utils.db.provide_session decorator. In this case,
    sessions are automatically closed after retrieving query results.
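
The pattern in the two points above can be sketched roughly as follows. This is only an illustration that runs without Airflow: the `provide_session` defined here is a simplified stand-in for the decorator named above, a temporary SQLite file stands in for the Airflow database, and `save_status`, `load_status`, `long_running_task` and the `task_status` table are hypothetical names. In a real DAG you would import the Airflow decorator instead of defining one, and the session would be a SQLAlchemy session.

```python
import functools
import os
import sqlite3
import tempfile

# Hypothetical stand-in database; in Airflow this would be the metadata DB.
DB_PATH = os.path.join(tempfile.mkdtemp(), "demo.db")


def provide_session(func):
    """Simplified stand-in for Airflow's provide_session (an assumption, not
    the real implementation): open a session if the caller did not pass one
    as a keyword, commit on success, and always close it afterwards."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if kwargs.get("session") is not None:
            return func(*args, **kwargs)  # caller manages the session
        session = sqlite3.connect(DB_PATH)
        try:
            result = func(*args, session=session, **kwargs)
            session.commit()
            return result
        except Exception:
            session.rollback()
            raise
        finally:
            session.close()
    return wrapper


@provide_session
def save_status(task_id, status, session=None):
    # One short database interaction per function.
    session.execute(
        "CREATE TABLE IF NOT EXISTS task_status "
        "(task_id TEXT PRIMARY KEY, status TEXT)"
    )
    session.execute(
        "INSERT OR REPLACE INTO task_status VALUES (?, ?)", (task_id, status)
    )


@provide_session
def load_status(task_id, session=None):
    row = session.execute(
        "SELECT status FROM task_status WHERE task_id = ?", (task_id,)
    ).fetchone()
    return row[0] if row else None


def long_running_task():
    save_status("extract", "running")  # session opened and closed here
    # ... long non-database work would go here; no connection is held open ...
    save_status("extract", "done")     # a fresh session, not a stale one
    return load_status("extract")
```

Because every query function opens and closes its own session, no connection sits idle while the slow part of the task runs, which is the situation the answer describes as leading to the closed connection.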