How do I run cleanup code in a Python multiprocessing pool?
I have some Python code (on Windows) that uses the multiprocessing module to run a pool of worker processes. Each worker process needs to do some cleanup at the end of the map_async method.

Does anyone know how to do that?
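For concreteness, the setup being described might look like the following minimal sketch (the names work and the pool size are illustrative, not from the question):

```python
import multiprocessing

def work(i):
    # Stand-in for the real per-task work; imagine it opens files or
    # connections that need cleaning up at some point.
    return i * i

if __name__ == '__main__':
    with multiprocessing.Pool(4) as pool:
        # Where should per-worker cleanup go when map_async finishes?
        results = pool.map_async(work, range(10)).get()
    print(results)
```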
2 Answers
Do you really want to run a cleanup function once for each worker process, rather than once for every task created by the map_async call? multiprocessing.pool.Pool creates a pool of, say, 8 worker processes. map_async might submit 40 tasks to be distributed among the 8 workers. I can imagine why you might want to run cleanup code at the end of each task, but I'm having trouble imagining why you would want to run cleanup code just before each of the 8 worker processes is finalized.

Nevertheless, if that is what you want to do, you could do it by monkeypatching multiprocessing.pool.worker:
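The original code block and its output did not survive on this page; the following is a minimal sketch of the monkeypatching idea. It relies on multiprocessing.pool.worker, an undocumented internal whose signature varies between Python versions (hence the *args passthrough), and the names cleanup, patched_worker, and work are illustrative:

```python
import os
import multiprocessing
import multiprocessing.pool

# Keep a reference to the undocumented internal worker loop.
_original_worker = multiprocessing.pool.worker

def cleanup():
    print('cleanup for worker {}'.format(os.getpid()))

def patched_worker(*args, **kwargs):
    # Run the normal worker loop; when it returns (i.e. the worker is
    # shutting down), run the cleanup once for this process.
    _original_worker(*args, **kwargs)
    cleanup()

# Pool looks up `worker` in multiprocessing.pool when it creates its
# processes, so the patch must be installed before the Pool is created.
# On Windows (spawn), module-level code like this re-runs in each child,
# so the patch is applied there too.
multiprocessing.pool.worker = patched_worker

def work(i):
    return i * i

if __name__ == '__main__':
    pool = multiprocessing.Pool(2)
    print(pool.map_async(work, range(8)).get())
    pool.close()   # let the workers drain their queues and exit normally...
    pool.join()    # ...so cleanup() runs once per worker process
```

Note that close()/join() matters here: terminate() (which the Pool context manager calls on exit) kills the workers abruptly, so the cleanup would never run.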
Your only real option here is to run cleanup at the end of the function you map_async to.

If this cleanup is honestly intended for process death, you cannot use the concept of a pool. They are orthogonal. A pool does not dictate the process lifetime unless you use maxtasksperchild, which is new in Python 2.7. Even then, you do not gain the ability to run code at process death. However, maxtasksperchild might suit you, because any resources that the process opens will definitely go away when the process is terminated.

That being said, if you have a bunch of functions that you need to run cleanup on, you can save duplication of effort by designing a decorator. Here's an example of what I mean:
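The original example is also missing from this page; this is a sketch of the decorator idea under the same assumptions as above (the names with_cleanup, cleanup, and work are illustrative). Applying the decorator at module level keeps the wrapped function picklable, which map_async requires:

```python
import functools
import multiprocessing

def cleanup():
    print('cleanup running at end of task')

def with_cleanup(func):
    """Run `cleanup` at the end of every call, even if `func` raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        finally:
            cleanup()
    return wrapper

@with_cleanup
def work(i):
    print('work({})'.format(i))
    return i * i

if __name__ == '__main__':
    with multiprocessing.Pool(2) as pool:
        print(pool.map_async(work, range(4)).get())
```

Any other task function that needs the same cleanup just gets the same @with_cleanup line, which is where the duplication savings come from.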
When you execute this (barring stdout being jumbled, because I'm not locking it here for brevity), the order in which things are printed should indicate that your cleanup task is running at the end of each task.