Problems with our celery daemon

Posted 2024-11-17 21:13:34

We're having issues with our celery daemon being very flaky. We use a fabric deployment script to restart the daemon whenever we push changes, but for some reason this is causing massive issues.

Whenever the deployment script is run, the celery processes are left in some pseudo-dead state. They will (unfortunately) still consume tasks from rabbitmq, but they won't actually do anything. Confusingly, a brief inspection suggests everything is "fine" in this state: celeryctl status shows one node online, and ps aux | grep celery shows two running processes.

However, attempting to run /etc/init.d/celeryd stop manually results in the following error:

start-stop-daemon: warning: failed to kill 30360: No such process

While in this state, attempting to run celeryd start appears to work correctly, but in fact it does nothing. The only way to fix the issue is to manually kill the running celery processes and then start them again.
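
For concreteness, that by-hand recovery amounts to roughly the following, sketched here as a hypothetical Fabric task to match our deployment tooling; the task name and the pkill pattern are assumptions, not our actual commands:

# fabfile.py -- hypothetical sketch of the by-hand recovery (Fabric 1.x style)
from fabric.api import sudo, task, settings

@task
def force_restart_celeryd():
    # Kill the stuck workers outright; warn_only so a missing process isn't fatal.
    with settings(warn_only=True):
        sudo("pkill -9 -f celeryd")
    # Then bring them back up through the init script.
    sudo("/etc/init.d/celeryd start")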

Any ideas what's going on here? We also don't have complete confirmation, but we think the problem develops on its own after a few days as well, with no deployment at all (there is no activity; this is currently a test server).

Comments (1)

み青杉依旧 2024-11-24 21:13:34

I can't say that I know what's ailing your setup, but I've always used supervisord to run celery -- maybe the issue has to do with upstart? Regardless, I've never experienced this with celery running on top of supervisord.

For good measure, here's a sample supervisor config for celery:

[program:celeryd]
directory=/path/to/project/
command=/path/to/project/venv/bin/python manage.py celeryd -l INFO
user=nobody
autostart=true
autorestart=true
startsecs=10
numprocs=1
stdout_logfile=/var/log/sites/foo/celeryd_stdout.log
stderr_logfile=/var/log/sites/foo/celeryd_stderr.log

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600

Restarting celeryd in my fab script is then as simple as issuing a sudo supervisorctl restart celeryd.
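
For example, the relevant piece of the fabfile can be as small as this sketch (assuming Fabric 1.x; the task name is arbitrary):

# fabfile.py -- minimal sketch, Fabric 1.x style
from fabric.api import sudo, task

@task
def restart_celeryd():
    # supervisord handles stopping the old worker and starting a fresh one,
    # honoring stopwaitsecs from the config above.
    sudo("supervisorctl restart celeryd")

Running fab restart_celeryd from the deploy machine then performs the restart over SSH.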
