How to get logs back from remotely invoked, Dockerized Celery workers into the Airflow containers

Posted on 2025-02-05 15:44:12

I am working on a Dockerized Python/Django project that includes a container for Celery workers, into which I have been integrating the off-the-shelf Airflow Docker containers.

I have Airflow successfully running Celery tasks in the pre-existing container by instantiating a Celery app with the Redis broker and backend specified, and making a remote call via send_task; however, none of the logging carried out by the Celery task makes it back to the Airflow logs.

Initially, as a proof of concept (I am completely new to Airflow), I had set it up to run the same code by exposing it to the Airflow containers and creating Airflow tasks to run it on the Airflow Celery worker container. This did result in all the logging being captured, but it's definitely not the way we want it architected, as it makes the Airflow containers very fat due to duplicating the dependencies of the Django project.

The documentation says "Most task handlers send logs upon completion of a task", but I wasn't able to find more detail that might give me a clue as to how to enable the same in my situation.

Is there any way to get these logs back to Airflow when running the Celery tasks remotely?

缪败 2025-02-12 15:44:12

Instead of "returning the logs to Airflow", an easy-to-implement alternative (because Airflow natively supports it) is to activate remote logging. This way, all logs from all workers would end up e.g. on S3, and the webserver would automatically fetch them.

The following illustrates how to configure remote logging using an S3 backend. Other options (e.g. Google Cloud Storage, Elastic) can be implemented similarly.

  1. Set remote_logging to True in airflow.cfg
  2. Build an Airflow connection URI. This example from the official docs is particularly useful IMO. One should end up having something like:
aws://AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI%2FK7MDENG%2FbPxRfiCYEXAMPLEKEY@/? 
endpoint_url=http%3A%2F%2Fs3%3A4566%2F

        It is also possible to create the connectino through the webserver GUI, if needed.

  3. Make the connection URI available to Airflow. One way of doing so is to make sure that the environment variable AIRFLOW_CONN_{YOUR_CONNECTION_NAME} is available. Example for connection name REMOTE_LOGS_S3:
export AIRFLOW_CONN_REMOTE_LOGS_S3=aws://AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI%2FK7MDENG%2FbPxRfiCYEXAMPLEKEY@/?endpoint_url=http%3A%2F%2Fs3%3A4566%2F
  4. Set remote_log_conn_id to the connection name (e.g. REMOTE_LOGS_S3) in airflow.cfg
  5. Set remote_base_log_folder in airflow.cfg to the desired bucket/prefix. Example:
remote_base_log_folder = s3://my_bucket_name/my/prefix
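Pulling the airflow.cfg settings from steps 1, 4 and 5 together, the relevant fragment might look like the following sketch (in Airflow 2.x these options sit in the [logging] section; in 1.10 they lived under [core] - check your version's docs):

```ini
# Hypothetical airflow.cfg fragment enabling remote logging to S3.
[logging]
remote_logging = True
remote_log_conn_id = REMOTE_LOGS_S3
remote_base_log_folder = s3://my_bucket_name/my/prefix
```

The same three options can equally be set via AIRFLOW__LOGGING__* environment variables.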

This related SO answer goes deeper into remote logging.

If debugging is needed, looking into any worker logs locally (i.e., inside the worker) should help.
