Can you do nested parallelization with multiprocessing in Python?
I am new to multiprocessing in Python and I am trying to do the following:
import os
from multiprocessing import Pool
from random import randint

def example_function(a):
    new_numbers = [randint(1, a) for i in range(0, 50)]
    with Pool(processes=os.cpu_count()-1) as pool:
        results = pool.map(str, new_numbers)
    return results

if __name__ == '__main__':
    numbers = [randint(1, 50) for i in range(0, 50)]
    with Pool(processes=os.cpu_count()) as pool:
        results = pool.map(example_function, numbers)
    print("Final results:", results)
However, when running this I get: "AssertionError: daemonic processes are not allowed to have children".
Replacing either pool.map with a for loop does make it work. E.g. for the second one:
results = []
for n in numbers:
    results.append(example_function(n))
However, since both the outer and inner tasks are very intensive I would like to be able to parallelize both. How can I do this?
Answer (1)
multiprocessing.Pool creates processes with the daemon flag set to True. According to the Python documentation of the Process class, this prevents sub-processes from being created in worker processes. Theoretically, you could create your own pool and use a custom context that bypasses the default process creation so as to create non-daemonic processes. However, you should not do that, because the termination of the processes would be unsafe, as stated in the documentation.
In fact, creating pools inside pools is not a good idea in practice, as each process of the outer pool will create another pool of processes. This results in a lot of processes being created, which is very inefficient. In some cases, the number of processes would be too big for the OS to create them (the limit depends on the platform). For example, on a many-core processor like a recent 64-core AMD Threadripper processor with 128 threads, the total number of processes would be 128 * 128 = 16384, which is clearly not reasonable. The usual solution to this problem is to reason about tasks and not processes. Tasks can be added to a shared queue so they can be computed by workers, and the workers can themselves spawn new tasks by adding them to the shared queue. AFAIK, multiprocessing managers are useful for designing such a system.