How do I guarantee message delivery with Celery?
I have a Python application where I want to start doing more work in the background so that it will scale better as it gets busier. In the past I have used Celery for normal background tasks, and this has worked well.
The only difference between this application and the others I have done in the past is that I need to guarantee that these messages are processed; they can't be lost.
For this application I'm not too concerned about the speed of my message queue; I need reliability and durability first and foremost. To be safe I want to have two queue servers in different data centers, one a backup of the other, in case something goes wrong.
Looking at Celery, it looks like it supports a bunch of different backends, some with more features than others. The two most popular look to be Redis and RabbitMQ, so I took some time to examine them further.
RabbitMQ:
Supports durable queues and clustering, but the problem with the way clustering works today is that if you lose a node in the cluster, all messages on that node are unavailable until you bring it back online. It doesn't replicate messages between the different nodes in the cluster; it only replicates the metadata about each message and then goes back to the originating node to fetch it. If that node isn't running, you are S.O.L. Not ideal.
The way they recommend to get around this is to set up a second server, replicate the file system using DRBD, and then run something like Pacemaker to switch clients to the backup server when needed. This seems pretty complicated; I'm not sure if there is a better way. Does anyone know of one?
Redis:
Supports a read slave, which would allow me to have a backup in case of emergency, but it doesn't support a master-master setup, and I'm not sure whether it handles active failover between master and slave. It doesn't have the same features as RabbitMQ, but it looks much easier to set up and maintain.
Questions:
What is the best way to set up Celery so that it will guarantee message processing?
Has anyone done this before? If so, would you mind sharing what you did?
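For reference, here is roughly the baseline I'd start from: a minimal sketch of the durability-related Celery settings I know about (late acknowledgement, persistent delivery, durable queues). The broker URL, queue name, and task are placeholders, and the lowercase setting names assume a recent Celery.

    # Minimal sketch of durability-oriented Celery settings; the broker
    # URL, queue, and task below are placeholders, not a real deployment.
    from celery import Celery
    from kombu import Exchange, Queue

    app = Celery('tasks', broker='amqp://guest@localhost//')

    app.conf.update(
        # Acknowledge only after the task finishes, so a worker crash
        # lets the broker redeliver the message instead of losing it.
        task_acks_late=True,
        # Mark messages persistent so a durable broker writes them to disk.
        task_default_delivery_mode='persistent',
        # A durable queue and exchange survive a broker restart.
        task_queues=[
            Queue('tasks', Exchange('tasks', durable=True),
                  routing_key='tasks', durable=True),
        ],
        task_default_queue='tasks',
        # Prefetch one message at a time so fewer unacknowledged
        # messages are at risk if a worker dies.
        worker_prefetch_multiplier=1,
    )

    @app.task
    def process(payload):
        # With late acks the same message can be delivered more than
        # once, so the task has to be idempotent.
        print('processing', payload)

Note that late acknowledgement trades at-most-once for at-least-once delivery: a crashed worker's message goes back on the queue, so tasks need to tolerate being run twice.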
Answers (5):
A lot has changed since the OP! There is now a high-availability option, aka "mirrored" queues. This goes pretty far toward solving the problem you described. See http://www.rabbitmq.com/ha.html.
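If you go this route, the queue itself has to be declared as mirrored. A minimal sketch using Kombu's queue_arguments with the older per-queue 'x-ha-policy' argument; on RabbitMQ 3.x this is done instead with a broker-side policy (rabbitmqctl set_policy). Queue and exchange names are placeholders.

    # Sketch: declare a durable queue mirrored across all cluster nodes.
    # 'x-ha-policy' is the older per-queue mechanism; on RabbitMQ 3.x
    # you would set a broker-side policy with `rabbitmqctl set_policy`
    # instead. Names here are placeholders.
    from kombu import Exchange, Queue

    mirrored_queue = Queue(
        'tasks',
        Exchange('tasks', durable=True),
        routing_key='tasks',
        durable=True,
        queue_arguments={'x-ha-policy': 'all'},
    )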
You might want to check out IronMQ; it covers your requirements (durable, highly available, etc.) and is a cloud-native solution, so zero maintenance. And there's a Celery broker for it: https://github.com/iron-io/iron_celery, so you can start using it just by changing your Celery config.
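For what it's worth, a sketch of what that config change might look like. This assumes the ironmq:// broker scheme registered by the iron_celery package; the project ID and token are placeholders.

    # Sketch: point Celery at IronMQ via the iron_celery broker.
    # The ironmq:// scheme and import below assume the iron_celery
    # package; PROJECT_ID and TOKEN are placeholders.
    import iron_celery  # registers the ironmq:// transport
    from celery import Celery

    app = Celery('tasks', broker='ironmq://PROJECT_ID:TOKEN@')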
I suspect that Celery bound to existing backends is the wrong solution for the reliability guarantees you need.
Given that you want a distributed queueing system with strong durability and reliability guarantees, I'd start by looking for such a system (they do exist) and then figuring out the best way to bind to it in Python. That may be via Celery & a new backend, or not.
I've used Amazon SQS for this purpose and got good results. A message keeps being redelivered until you explicitly delete it from the queue, and it lets your app grow as much as you need.
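The guarantee comes from SQS's receive/process/delete cycle: a message is only removed by an explicit delete, so a crashed worker just means redelivery after the visibility timeout. A minimal sketch using boto3; the queue URL, region, and task logic are placeholders.

    # Sketch of the SQS receive -> process -> delete cycle that
    # provides the guarantee; queue URL and region are placeholders.
    import boto3

    sqs = boto3.client('sqs', region_name='us-east-1')
    queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/tasks'

    def process(body):
        # Placeholder task logic; it must be idempotent, since SQS can
        # deliver the same message more than once.
        print('processing', body)

    resp = sqs.receive_message(QueueUrl=queue_url,
                               MaxNumberOfMessages=1,
                               WaitTimeSeconds=20)
    for msg in resp.get('Messages', []):
        process(msg['Body'])
        # The message is removed only by this explicit delete; if the
        # worker crashes before reaching it, SQS redelivers the message
        # after the visibility timeout expires.
        sqs.delete_message(QueueUrl=queue_url,
                           ReceiptHandle=msg['ReceiptHandle'])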
Is using a distributed rendering system an option? Normally these are reserved for HPC, but a lot of the concepts are the same. Check out Qube or Deadline Render; there are other, open-source solutions as well. All are designed with failover in mind, given the high degree of complexity and risk of failure in renders that can take hours per frame of an image sequence.