How to use C extensions in Python to get around the GIL
I want to run a CPU-intensive program in Python across multiple cores and am trying to figure out how to write C extensions to do this. Are there any code samples or tutorials on this?
Comments (5)
multiprocessing is easy. If that's not fast enough, your question is complicated.
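As a rough illustration of what "easy" can look like, here is a minimal sketch using `multiprocessing.Pool` from the standard library. The prime-counting workload and the helper names (`count_primes`, `split_range`, `count_primes_parallel`) are made up for this example and stand in for whatever CPU-bound work you actually have; the worker count of 4 is likewise an assumption.

```python
import math
from multiprocessing import Pool


def count_primes(bounds):
    """Count primes in the half-open range [lo, hi) by trial division (CPU-bound)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        # n is prime if no d in [2, sqrt(n)] divides it.
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def split_range(lo, hi, parts):
    """Split [lo, hi) into at most `parts` contiguous chunks of similar size."""
    step = max(1, -(-(hi - lo) // parts))  # ceiling division, at least 1
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def count_primes_parallel(lo, hi, workers=4):
    """Hand each chunk to a separate worker process and add up the results."""
    with Pool(processes=workers) as pool:
        return sum(pool.map(count_primes, split_range(lo, hi, workers)))


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module.
    print(count_primes_parallel(0, 200_000))
```

Each worker is a separate interpreter with its own GIL, so the chunks can run on different cores. Arguments and results are pickled between processes, which is why this works best when the data passed around is small relative to the computation.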
You can already break a Python program into multiple processes, and the OS will already spread those processes across all the cores.
Do this: run the parts as a pipeline of separate processes connected through their standard streams.
The OS will ensure that each part uses as many resources as possible. You can trivially pass information along this pipeline by using cPickle (the pickle module in Python 3) on sys.stdin and sys.stdout. Without too much work, this can often lead to dramatic speedups.
Yes, to the haters: it's possible to construct an algorithm so tortured that it may not be sped up much. However, this approach often yields huge benefits for minimal work.
And the restructuring for this purpose will exactly match the restructuring required to maximize thread concurrency. So start with shared-nothing process parallelism until you can prove that sharing more data would help, then move to the more complex shared-everything thread parallelism.
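A minimal sketch of one stage in such a pipeline, assuming Python 3 (where `cPickle` has become `pickle`). The function names and the squaring `transform` are hypothetical placeholders, not part of the original answer. The helpers take any binary stream so they can be tried in memory; a real stage would pass `sys.stdin.buffer` and `sys.stdout.buffer`, as `main()` does.

```python
import pickle
import sys


def read_items(stream):
    """Yield unpickled objects from a binary stream until it is exhausted."""
    while True:
        try:
            yield pickle.load(stream)
        except EOFError:
            return


def write_items(items, stream):
    """Pickle each item onto a binary stream, one after another."""
    for item in items:
        pickle.dump(item, stream)
    stream.flush()


def transform(item):
    """Hypothetical CPU-bound step; replace with the real work for this stage."""
    return item * item


def run_stage(in_stream, out_stream):
    """Read items, transform each one, and pass the results downstream."""
    write_items((transform(x) for x in read_items(in_stream)), out_stream)


def main():
    """Entry point for a real stage: binary stdin in, binary stdout out."""
    run_stage(sys.stdin.buffer, sys.stdout.buffer)

# A real stage script would end with:
#     if __name__ == "__main__":
#         main()
```

With that ending added, stages saved as separate scripts could be chained with ordinary shell pipes, and the OS would schedule each process on whichever core is free. Note that unpickling data from an untrusted source is unsafe, so this is only appropriate between processes you control.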
Take a look at multiprocessing. It's an often overlooked fact that operating systems prefer programs that don't share data globally and don't cram loads of threads into a single process.
If you still insist that your CPU-intensive behaviour requires threading, take a look at the documentation for working with the GIL in C. It's quite informative.
This is a good use of a C extension. The keyword you should search for is Py_BEGIN_ALLOW_THREADS.
http://docs.python.org/c-api/init.html#thread-state-and-the-global-interpreter-lock
P.S. I mean that if your processing is already in C, like image processing, then releasing the lock in the C extension is a good approach. If your processing code is mainly in Python, other people's suggestion of multiprocessing is better. It is usually not justified to rewrite the code in C for background processing.
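To see the effect of C code that releases the GIL without writing an extension first, here is a small sketch using `hashlib` from the standard library as a stand-in. Its documentation states that the GIL is released while hashing data larger than 2047 bytes, which is the same pattern a C extension gets by wrapping its inner loop in `Py_BEGIN_ALLOW_THREADS` / `Py_END_ALLOW_THREADS`. The helper names, block sizes, and worker count are assumptions for illustration, and any speedup you observe will depend on your machine.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(block):
    """Hash one buffer; for large buffers the C code runs with the GIL released."""
    return hashlib.sha256(block).hexdigest()


def digest_all_threaded(blocks, workers=4):
    """Hash every block on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, blocks))


if __name__ == "__main__":
    # Eight 4 MiB buffers: large enough that hashing dominates thread overhead.
    blocks = [bytes([i]) * (4 * 1024 * 1024) for i in range(8)]
    threaded = digest_all_threaded(blocks)
    sequential = [digest(b) for b in blocks]
    print(threaded == sequential)
```

Because the heavy loop runs outside the GIL, plain threads can keep several cores busy here. The same threads running a pure-Python loop would take turns holding the GIL instead, which is why this approach only pays off when the processing is already in C.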
Have you considered using one of the Python MPI libraries, like mpi4py? Although MPI is normally used to distribute work across a cluster, it works quite well on a single multicore machine. The downside is that you'll have to refactor your code to use MPI's communication calls (which may be easy).