Detecting and diagnosing silently crashing workers



I'm running Celery 2, daemonized as described in http://ask.github.com/celery/cookbook/daemonizing.html, with RabbitMQ.
From time to time a silent crash happens, and the only thing I see in celeryd.log is:

[2010-12-24 14:14:31,323: INFO/PoolWorker-1414] process shutting down
[2010-12-24 14:14:31,323: INFO/PoolWorker-1414] process exiting with exitcode 0
[2010-12-24 14:14:31,331: INFO/PoolWorker-1415] child process calling self.run()
[2010-12-24 14:14:48,673: INFO/MainProcess] Got task from broker: airsale.search.xxx.get_search_results[01bf5d36-7c0e-4f8a-af69-750ef1b24abc]
[2010-12-24 14:14:48,761: INFO/MainProcess] Got task from broker: airsale.search.xxx.get_search_results[2d5f9952-d493-4de4-9752-0eee1776147d]
[2010-12-24 14:14:48,861: INFO/MainProcess] Got task from broker: airsale.search.xxx.get_search_results[0c77c1ec-df6c-4e34-875c-44909fbf8b9f]
[2010-12-24 14:14:48,961: INFO/MainProcess] Got task from broker: airsale.search.xxx.get_search_results[3d83dd54-0be8-4cf9-9cd6-81e070d97170]
[2010-12-24 14:14:49,061: INFO/MainProcess] Got task from broker: airsale.search.xxx.get_search_results[2dd29e70-e085-4fd1-a7ef-12d06b21644c]
..........

After that, only "Got task from broker" lines appear, without any task processing.

ps -C celeryd shows that the celery nodes are still running.

If I run /etc/init.d/celeryd restart, the number of celeryd processes doubles. It seems the old processes are no longer controlled by the init script.

  1. How can I detect why task processing is not performed, even though tasks are received from the broker?
  2. Why are the old celeryd processes not killed by /etc/init.d/celeryd restart?


Comments (1)

水波映月 2024-10-15 22:01:31


The queue workers are stalled, so the main solution is to set a time limit for each task, so that the worker is restarted when a task exceeds this limit.

Add the following to your task:

from celery.decorators import task
from celery.exceptions import SoftTimeLimitExceeded


@task()
def mytask():
    try:
        do_something()        # placeholder for the real work of the task
    except SoftTimeLimitExceeded:
        clean_something()     # placeholder: release resources before giving up
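
When the soft limit is hit, Celery raises SoftTimeLimitExceeded inside the task so it can clean up after itself; when the hard limit is hit, the pool process is killed outright and replaced with a fresh one, which is what unsticks a stalled worker.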

In your settings.py, add the following:

CELERYD_TASK_TIME_LIMIT = 60        # hard limit, in seconds: the worker process is killed and replaced
CELERYD_TASK_SOFT_TIME_LIMIT = 30   # soft limit, in seconds: raises SoftTimeLimitExceeded first

Keep the soft limit below the hard limit; if both fire at the same moment, the process can be killed before the SoftTimeLimitExceeded handler has a chance to run.
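
For question 1, you can also probe the workers from the outside while they are wedged. A minimal sketch, assuming Celery 2.1 or later (where the remote-control inspect API is available) and a reachable broker:

# Probe running workers over the broker using the remote-control inspect API.
from celery.task.control import inspect

i = inspect()                      # all workers; pass a list of node names to narrow it

print("ping:     ", i.ping())      # which workers respond at all
print("active:   ", i.active())    # tasks currently being executed
print("reserved: ", i.reserved())  # tasks prefetched from the broker but not yet started

If ping() answers but active() keeps reporting the same task ids across repeated calls, the pool processes are hung rather than dead; with the time limits above in place, such tasks are killed off and the worker keeps going.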