Python multiprocessing never joins
I'm using multiprocessing, and specifically a Pool, to spin off a couple of 'threads' to do a bunch of slow jobs that I have. However, for some reason, I can't get the main thread to rejoin, even though all of the children appear to have died.
Resolved: It appears the answer to this question is to just launch multiple Process objects, rather than using a Pool. It's not abundantly clear why, but I suspect the remaining process is a manager for the pool and it's not dying when the workers finish. If anyone else has this problem, this is the answer.
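A minimal sketch of the Process-based approach the resolution describes; do_work and run_jobs are illustrative names, not from the original script, and the real slow job is elided:

```python
import multiprocessing
import sys

def do_work(job_id):
    # Stand-in for the slow job; the actual work is elided in the question.
    return job_id * 2

def run_jobs(n_jobs=12):
    # Launch one Process per job instead of using a Pool, then join each one.
    procs = []
    for i in range(n_jobs):
        p = multiprocessing.Process(target=do_work, args=(i,))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()  # blocks until that child exits
    return [p.exitcode for p in procs]

if __name__ == "__main__":
    sys.stderr.write("Waiting for jobs to terminate\n")
    print(run_jobs())
```

Unlike a Pool, there is no manager machinery here: each child is a plain OS process, and join() returns as soon as it exits.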
Main Thread
import sys
from multiprocessing import Pool

pool = Pool(processes=12, initializer=thread_init)
for x in xrange(0, 13):
    pool.apply_async(thread_dowork)
pool.close()
sys.stderr.write("Waiting for jobs to terminate\n")
pool.join()
The xrange(0,13) is one more than the number of processes because I thought I had an off-by-one error: one process wasn't getting a job, so it wasn't dying, and I wanted to force it to take a job. I have tried it with 12 as well.
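For reference, xrange(0, 13) yields 13 values (0 through 12), so the loop above submits 13 jobs to a pool of 12 workers:

```python
# xrange is Python 2; range behaves the same way in Python 3.
jobs = list(range(0, 13))
print(len(jobs))   # number of jobs submitted
print(jobs[0], jobs[-1])
```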
Multiprocessing Functions
import os
import sys

def thread_init():
    global log_out
    # Give each worker its own log file, named after its PID, and
    # redirect that worker's stdout/stderr into it.
    log_out = open('pool_%s.log' % os.getpid(), 'w')
    sys.stderr = log_out
    sys.stdout = log_out
    log_out.write("Spawned")
    log_out.flush()
    log_out.write(" Complete\n")
    log_out.flush()

def thread_dowork():
    log_out.write("Entered function\n")
    log_out.flush()
    #Do Work
    log_out.write("Exiting ")
    log_out.flush()
    log_out.close()
    sys.exit(0)
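For comparison, Pool workers are ordinarily written to return (or raise) rather than call sys.exit(). A minimal, self-contained sketch under that assumption (do_work and run_pool are illustrative names, and the marker string replaces the elided work):

```python
from multiprocessing import Pool

def do_work(x):
    # Stand-in for the slow job; returns a marker instead of writing
    # to a per-process log and calling sys.exit(0).
    return 'done-%d' % x

def run_pool(n_procs=4, n_jobs=12):
    pool = Pool(processes=n_procs)
    # Keep the AsyncResult handles so the results can be collected.
    results = [pool.apply_async(do_work, (i,)) for i in range(n_jobs)]
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the workers to finish
    return [r.get() for r in results]

if __name__ == '__main__':
    print(run_pool())
```

Here close() followed by join() returns once every submitted task has completed, and each task's return value comes back through its AsyncResult.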
The output of the logfiles for all 12 children is:
Spawned
Complete
Entered function
Exiting
The main thread prints 'Waiting for jobs to terminate', and then just sits there.
top shows only one copy of the script (the main one, I believe). htop shows two copies: one is the one from top, and the other is something else. Based on its PID, it's none of the children either.
Does anyone know something I don't?
1 Comment
I don't really have an answer, but I read the docs for apply_async and it seems counter to your stated problem... I'm not familiar with the Pool, but it seems to me that your use-case could easily be handled by this recipe on Python Module of the Week.