Parallelism in Python
What are the options for achieving parallelism in Python? I want to perform a bunch of CPU bound calculations over some very large rasters, and would like to parallelise them. Coming from a C background, I am familiar with three approaches to parallelism:
- Message passing processes, possibly distributed across a cluster, e.g. MPI.
- Explicit shared memory parallelism, either using pthreads or fork(), pipe(), et al.
- Implicit shared memory parallelism, using OpenMP.
Deciding on an approach to use is an exercise in trade-offs.
In Python, what approaches are available and what are their characteristics? Is there a clusterable MPI clone? What are the preferred ways of achieving shared memory parallelism? I have heard reference to problems with the GIL, as well as references to tasklets.
In short, what do I need to know about the different parallelization strategies in Python before choosing between them?
5 Answers
Generally, you describe a CPU bound calculation. This is not Python's forte. Neither, historically, is multiprocessing.
Threading in the mainstream Python interpreter has been ruled by a dreaded global lock. The new multiprocessing API works around that and gives a worker pool abstraction with pipes and queues and such.
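A minimal sketch of that worker-pool abstraction (the square function is just a stand-in for real CPU-bound work):

```python
from multiprocessing import Pool

def square(x):
    # Placeholder for a real CPU-bound computation.
    return x * x

if __name__ == "__main__":  # guard required when processes are spawned
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))  # fan the calls out to workers
    print(results)
```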
You can write your performance critical code in C or Cython, and use Python for the glue.
The new (2.6) multiprocessing module is the way to go. It uses subprocesses, which gets around the GIL problem. It also abstracts away some of the local/remote issues, so the choice of running your code locally or spread out over a cluster can be made later. The module's documentation is a fair bit to chew on, but should provide a good basis to get started.
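For example, a sketch of the lower-level message-passing primitives the module provides, with a subprocess fed through queues (the worker function and payload are invented for illustration):

```python
from multiprocessing import Process, Queue

def worker(inbox, outbox):
    # Pull items until the None sentinel arrives, pushing results back.
    for item in iter(inbox.get, None):
        outbox.put(item * item)

if __name__ == "__main__":
    inbox, outbox = Queue(), Queue()
    p = Process(target=worker, args=(inbox, outbox))
    p.start()
    for i in range(5):
        inbox.put(i)
    inbox.put(None)                        # signal the worker to stop
    results = [outbox.get() for _ in range(5)]
    p.join()
    print(results)
```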
Ray is an elegant (and fast) library for doing this.

The most basic strategy for parallelizing Python functions is to declare a function with the @ray.remote decorator; then it can be invoked asynchronously. You can also parallelize stateful computation using actors, again by using the @ray.remote decorator. It has a number of advantages over the multiprocessing module.

Ray is a framework I've been helping develop.
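A minimal sketch of both patterns (the function and actor are toy examples, not code from this answer):

```python
import ray

ray.init()  # start Ray on the local machine

@ray.remote
def square(x):
    # A plain function turned into an asynchronously invokable task.
    return x * x

@ray.remote
class Counter:
    # An actor: stateful computation living in its own worker process.
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))                      # [0, 1, 4, 9]

counter = Counter.remote()
print(ray.get(counter.increment.remote()))   # 1
```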
Depending on how much data you need to process and how many CPUs/machines you intend to use, it is in some cases better to write part of it in C (or Java/C# if you want to use Jython/IronPython).
The speedup you can get from that might do more for your performance than running things in parallel on 8 CPUs.
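One common way to glue such a C routine back into Python is ctypes; a hypothetical sketch, where libraster.so and its process() function are invented names for a library you would compile yourself:

```python
import ctypes

# Assumed C export, compiled into a shared library beforehand:
#     void process(double *data, size_t n);
lib = ctypes.CDLL("./libraster.so")
lib.process.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_size_t]
lib.process.restype = None

data = (ctypes.c_double * 4)(1.0, 2.0, 3.0, 4.0)  # a tiny stand-in raster
lib.process(data, len(data))                      # heavy lifting happens in C
print(list(data))
```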
There are many packages to do that; the most appropriate, as others have said, is multiprocessing, especially with the "Pool" class.

A similar result can be obtained with Parallel Python, which in addition is designed to work with clusters.

Anyway, I would say go with multiprocessing.
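For reference, submitting a job with Parallel Python looks roughly like this (a sketch from memory of its Python 2-era pp API; check the pp documentation before relying on it):

```python
import pp

def partial_sum(xs):
    return sum(xs)

job_server = pp.Server()                   # autodetects the number of CPUs
job = job_server.submit(partial_sum, ([1, 2, 3, 4],))  # schedule asynchronously
print(job())                               # calling the job waits for and returns its result
```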