Python: multithreading too slow, multiprocessing
I'm a multiprocessing newbie. I know something about threading, but I need to increase the speed of this calculation, hopefully with multiprocessing:
Example description: send a string to a thread, alter the string (plus a benchmark test), and send the result back for printing.

    from threading import Thread

    class Alter(Thread):
        def __init__(self, word):
            Thread.__init__(self)
            self.word = word
            self.word2 = ''

        def run(self):
            # Alter string + test processing speed
            for i in range(80000):
                self.word2 = self.word2 + self.word

    # Send a string to be altered
    thread1 = Alter('foo')
    thread2 = Alter('bar')
    thread1.start()
    thread2.start()

    # wait for both to finish
    while thread1.is_alive() == True: pass
    while thread2.is_alive() == True: pass

    print(thread1.word2)
    print(thread2.word2)
This currently takes about 6 seconds and I need it to go faster.
I have been looking into multiprocessing and cannot find anything equivalent to the above code. I think what I am after is pooling, but the examples I have found have been hard to understand. I would like to take advantage of all cores (8 cores, as reported by multiprocessing.cpu_count()), but I really just have scraps of useful information on multiprocessing and not enough to duplicate the above code. If anyone can point me in the right direction or, better yet, provide an example, that would be greatly appreciated. Python 3 please.
Comments (1)
Just replace threading with multiprocessing and Thread with Process. Threads in Python are (almost) never used to gain performance in CPU-bound code because of the big bad GIL (global interpreter lock)! I explained it in another SO post with some links to documentation and a great talk about threading in Python. But the multiprocessing module is intentionally very similar to the threading module. You can almost use it as a drop-in replacement!
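One place where the replacement is not quite drop-in: a Process runs in its own memory space, so an attribute such as self.word2 set inside run() only changes the child's copy and is not visible to the parent. Below is a minimal sketch of the question's Alter class ported to Process, assuming a multiprocessing.Queue is used to send the result back. The names build_string, alter_all, index, results and repeats are illustrative, not from the original code.

```python
from multiprocessing import Process, Queue


def build_string(word, repeats):
    """Same benchmark loop as the threaded version in the question."""
    word2 = ''
    for _ in range(repeats):
        word2 = word2 + word
    return word2


class Alter(Process):
    def __init__(self, index, word, results, repeats=80000):
        super().__init__()
        self.index = index      # position of this word in the input list
        self.word = word
        self.results = results  # shared Queue used to send the result back
        self.repeats = repeats

    def run(self):
        # Runs in the child process. Setting self.word2 here would only
        # change the child's copy, so the result goes onto the queue.
        self.results.put((self.index, build_string(self.word, self.repeats)))


def alter_all(words, repeats=80000):
    results = Queue()
    procs = [Alter(i, w, results, repeats) for i, w in enumerate(words)]
    for p in procs:
        p.start()
    # Read every result before join(): a child may not exit until the
    # data it queued has been flushed to the parent.
    collected = dict(results.get() for _ in procs)
    for p in procs:
        p.join()
    return [collected[i] for i in range(len(words))]


if __name__ == '__main__':  # needed on platforms that spawn child processes
    for text in alter_all(['foo', 'bar']):
        print(len(text))
```

join() also replaces the busy-wait loops on is_alive(), which would otherwise keep a core spinning while the workers run.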
As far as I know, the multiprocessing module offers no way to enforce the use of a specific number of cores; that is left to the OS implementation. You could use a Pool object and limit the number of worker processes to the core count. Or you could look at an MPI library such as pypar. Under Linux you could also use a pipe in the shell to start multiple instances on different cores.
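The Pool suggestion could look roughly like the sketch below. It assumes the string-building work is moved into a plain module-level function so the pool can send it to the workers. The names alter and alter_with_pool and the workers parameter are made up for illustration. Note that this caps the number of worker processes at the core count; which core each worker runs on is still up to the OS.

```python
from multiprocessing import Pool, cpu_count


def alter(word, repeats=80000):
    """Benchmark loop from the question, returning the built string."""
    word2 = ''
    for _ in range(repeats):
        word2 = word2 + word
    return word2


def alter_with_pool(words, workers=None):
    # One worker process per core unless a smaller number is requested.
    with Pool(processes=workers or cpu_count()) as pool:
        # map() blocks until all results are ready and keeps input order.
        return pool.map(alter, words)


if __name__ == '__main__':  # needed on platforms that spawn child processes
    for text in alter_with_pool(['foo', 'bar']):
        print(len(text))
```

With only two words, at most two workers have anything to do; the pool pays off when there are at least as many work items as cores.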