Using Celery as a fault-tolerant scheduler
I would like to use Celery with RabbitMQ as a fault-tolerant scheduler in a distributed environment.
By fault-tolerant, I mean that if a task is given to a worker and that worker goes down for whatever reason, Celery should be able to reschedule the task on another server.
How can this be achieved in an environment with multiple worker nodes?
Comments (2)
Probably all you need is to set CELERY_ACKS_LATE.
Late ack means the task message is acknowledged after the task has been executed rather than just before it, which is the default behaviour. That way, if the worker crashes, RabbitMQ still has the message.
More info here:
Retry Lost or Failed Tasks (Celery, Django and RabbitMQ)
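As a rough illustration of the late-ack setting, here is a minimal configuration module. It assumes Celery 4 or newer, where the lowercase name `task_acks_late` replaces the older `CELERY_ACKS_LATE`; the file name `celeryconfig.py` and the two extra settings are assumptions added for the sketch, not something the answer above prescribes.

```python
# celeryconfig.py (hypothetical file name)
# Sketch of late-acknowledgement settings, assuming Celery 4+ lowercase names.

# Acknowledge a task message only after the task has run, so an
# unacknowledged message stays with the broker if the worker dies.
task_acks_late = True

# Assumption: also requeue the message when the child process running the
# task is killed abruptly. Without this, Celery acknowledges the message in
# that case even with late acks. This can cause redelivery loops if a task
# always kills its worker.
task_reject_on_worker_lost = True

# Assumption: reserve only one message per worker process at a time, so a
# crashed worker is holding as few unacknowledged tasks as possible.
worker_prefetch_multiplier = 1
```

A module like this would typically be loaded with `app.config_from_object("celeryconfig")`. Because a task may then run more than once, tasks should be written to be idempotent.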
Have each of the workers consume from the same queue, and RabbitMQ will round-robin the messages to the workers (consumers). If any one of them fails while processing a job, before it has had a chance to send its acknowledgment, the message is automatically placed back on the queue and the next worker picks it up. This is an "at least once" delivery pattern.
This link from the RabbitMQ site explains the pattern and includes Python sample code.
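The requeue-on-failure behaviour described above can be sketched with a toy in-memory model. This is not RabbitMQ's API or the sample code from the RabbitMQ site; `ToyBroker`, the worker names, and the job names are all hypothetical and exist only to show why acknowledging after the work is done gives at-least-once delivery.

```python
from collections import deque


class ToyBroker:
    """In-memory stand-in for a queue with manual acknowledgements."""

    def __init__(self, messages):
        self.ready = deque(messages)  # messages waiting for delivery
        self.unacked = {}             # delivery tag -> (consumer, message)
        self._next_tag = 0

    def deliver(self, consumer):
        """Hand the next ready message to a consumer and track it as unacked."""
        message = self.ready.popleft()
        self._next_tag += 1
        self.unacked[self._next_tag] = (consumer, message)
        return self._next_tag, message

    def ack(self, tag):
        """The consumer finished the job, so the message can be forgotten."""
        del self.unacked[tag]

    def consumer_lost(self, consumer):
        """The consumer went away: requeue everything it had not acked."""
        for tag, (owner, message) in list(self.unacked.items()):
            if owner == consumer:
                del self.unacked[tag]
                self.ready.appendleft(message)


broker = ToyBroker(["job-1", "job-2"])
processed = []

tag, msg = broker.deliver("worker-a")  # worker-a receives job-1 ...
broker.consumer_lost("worker-a")       # ... and dies before acknowledging it

while broker.ready:                    # worker-b drains the queue
    tag, msg = broker.deliver("worker-b")
    processed.append(msg)              # do the work first
    broker.ack(tag)                    # acknowledge only after success

print(processed)
```

In this model the job held by the failed worker ends up processed by the surviving one. A job that had finished but was not yet acknowledged would be requeued in the same way, which is why the pattern is "at least once" rather than "exactly once".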