Is Celery as efficient on a local system as Python multiprocessing?
I'm having a bit of trouble deciding whether to use Python multiprocessing, Celery, or pp for my application.
My app is very CPU heavy, but it currently uses only one CPU, so I need to spread the work across all available CPUs (which is what led me to Python's multiprocessing library). However, I have read that this library can't scale out to other machines if that becomes necessary. Right now I'm not sure whether I'll need more than one server to run my code, but I'm thinking of running Celery locally; scaling would then only require adding new servers, instead of refactoring the code (as it would if I used multiprocessing).
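For context, here is a minimal sketch of what the multiprocessing side looks like; cpu_heavy is a hypothetical stand-in for the real workload:

```python
import multiprocessing as mp

def cpu_heavy(n):
    # Hypothetical stand-in for the real CPU-bound work
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    # Spread the work across all available cores on this machine
    with mp.Pool(processes=mp.cpu_count()) as pool:
        results = pool.map(cpu_heavy, [10**6] * mp.cpu_count())
```

The rough Celery equivalent wraps the same function as a task; scaling out then means pointing more workers (possibly on other machines) at the same broker rather than changing the code. The Redis broker URL below is only an assumption:

```python
# tasks.py (assumes a broker such as Redis is running locally)
from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')

@app.task
def cpu_heavy(n):
    return sum(i * i for i in range(n))

# Caller: results = [cpu_heavy.delay(10**6) for _ in range(8)]
```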
My question: is this logic correct? Is there any (performance) downside to using Celery locally (if it turns out that a single server with multiple cores can complete my task)? Or is it more advisable to use multiprocessing and grow out of it into something else later?
Thanks!
P.S. This is for a personal learning project, but maybe one day I'd like to work as a developer at a firm and want to learn how the professionals do it.
2 Answers
I just finished a test to decide how much overhead Celery adds over multiprocessing.Pool and shared arrays. The test runs the Wiener filter on a (292, 353, 1652) uint16 array. Both versions use the same chunking (roughly: divide the 292 and 353 dimensions by the square root of the number of available CPUs). Two Celery versions were tried: one solution sends pickled data, the other opens the underlying data file in every worker.

Result: on my 16-core i7 CPU, Celery takes about 16 s, multiprocessing.Pool with shared arrays about 15 s. I find this difference surprisingly small. Increasing granularity obviously increases the difference (Celery has to pass more messages): Celery takes 15 s, multiprocessing.Pool takes 12 s.

Take into account that the Celery workers were already running on the host, whereas the pool workers are forked at each run. I am not sure how I could start the multiprocessing pool once at the beginning, since I pass the shared arrays in the initializer:
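A minimal sketch of the pattern being described, with hypothetical names (pool_init, shared_in, shared_res, res_lock):

```python
import multiprocessing as mp

def pool_init(in_arr, res_arr, lock):
    # Runs once in each worker: stash the shared arrays as globals
    global shared_in, shared_res, res_lock
    shared_in, shared_res, res_lock = in_arr, res_arr, lock

shape = (292, 353, 1652)
n = shape[0] * shape[1] * shape[2]
shared_in = mp.RawArray('H', n)    # uint16 input; read-only, so no lock
shared_res = mp.RawArray('f', n)   # float32 result buffer
res_lock = mp.Lock()               # only the result arrays are locked

# Because the shared arrays are created per run, the pool has to be
# forked per run as well: the workers receive them via the initializer.
pool = mp.Pool(initializer=pool_init,
               initargs=(shared_in, shared_res, res_lock))
```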
and only the result arrays are protected by locking.
I have actually never used Celery, but I have used multiprocessing.
Celery seems to have several ways to pass messages (tasks) around, including ways that should let you run workers on different machines. So a downside might be that message passing could be slower than with multiprocessing, but on the other hand you could spread the load to other machines.

You are right that multiprocessing can only run on one machine. On the other hand, communication between the processes can be very fast, for example by using shared memory. Also, if you need to process very large amounts of data, you could easily read and write data from and to the local disk, and just pass filenames between the processes.
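A hedged sketch of that filename-passing idea; process_file and the chunk filenames are hypothetical:

```python
import multiprocessing as mp
from pathlib import Path

def process_file(path):
    # Worker reads its input from local disk, writes its result next to it,
    # and only the (small) filename string travels between processes.
    data = Path(path).read_bytes()
    out_path = Path(path).with_suffix('.out')
    out_path.write_bytes(data[::-1])   # placeholder for the real processing
    return str(out_path)

if __name__ == '__main__':
    chunk_files = ['chunk0.bin', 'chunk1.bin', 'chunk2.bin']  # assumed to exist
    with mp.Pool() as pool:
        result_files = pool.map(process_file, chunk_files)
```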
I don't know how well Celery would deal with task failures. For example, a task might never finish running, or might crash, or you might want the ability to kill a task if it does not finish within a certain time limit. I don't know how hard it would be to add support for that if it isn't there.
multiprocessing does not come with fault tolerance out of the box, but you can build that yourself without too much trouble.
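For example, a minimal sketch of a do-it-yourself time limit using apply_async with a timeout; slow_task is hypothetical:

```python
import multiprocessing as mp

def slow_task(n):
    # Hypothetical task that may run too long
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    with mp.Pool() as pool:
        handle = pool.apply_async(slow_task, (10**8,))
        try:
            print(handle.get(timeout=5))   # wait at most 5 seconds
        except mp.TimeoutError:
            # Handle the failure here: log it, retry, or tear the pool down.
            # Note: get() timing out does not kill the worker; to truly
            # terminate a runaway task you need a Process you can .terminate().
            print('task exceeded its time limit')
```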