Better multithreaded use of Python subprocess.Popen & communicate()?
I'm running multiple commands, each of which may take some time, in parallel on a Linux machine running Python 2.6.
So I used the subprocess.Popen class and the process.communicate() method to run multiple command groups in parallel and capture each group's output all at once after execution.
    import shlex
    import subprocess

    def run_commands(commands, print_lock):
        # this part runs in parallel.
        outputs = []
        for command in commands:
            proc = subprocess.Popen(shlex.split(command), stdout=subprocess.PIPE, stderr=subprocess.STDOUT, close_fds=True)
            output, unused_err = proc.communicate()  # buffers the output
            retcode = proc.poll()                    # ensures subprocess termination
            outputs.append(output)
        with print_lock:  # print them at once (synchronized)
            for output in outputs:
                for line in output.splitlines():
                    print(line)
Elsewhere, it is called like this:
    from threading import Lock, Thread

    processes = []
    print_lock = Lock()
    for ...:
        commands = ...  # a group of commands is generated, which takes some time.
        processes.append(Thread(target=run_commands, args=(commands, print_lock)))
        processes[-1].start()
    for p in processes: p.join()
    print('done.')
The expected result is that the output of each command group is displayed all at once, while the groups themselves execute in parallel.
But starting with the second output group (which thread comes second varies, of course, because scheduling is nondeterministic), the output is printed without newlines, each line is preceded by as many spaces as there were characters in the previous line, and input echo is turned off: the terminal state is "garbled" or "crashed". (If I issue the reset shell command, it returns to normal.)
At first I suspected the handling of '\r', but that was not the cause. As you can see in my code, I handle it properly with splitlines(), and I confirmed this by applying repr() to the output.
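To illustrate the check described above, here is a minimal sketch of how splitlines() treats carriage returns and how repr() makes them visible. The sample bytes are made up for illustration and are not the actual command output.

```python
# Hypothetical captured output containing a bare '\r', a '\r\n', and a '\n'.
sample = b"one\rtwo\r\nthree\n"

# splitlines() treats '\r', '\r\n', and '\n' as line boundaries and drops them.
lines = sample.splitlines()

# Passing True (keepends) keeps the terminators so repr() can show them.
lines_with_ends = sample.splitlines(True)

for line in lines_with_ends:
    print(repr(line))
```

Because the terminators are stripped by default, lines printed from `lines` should not carry a stray '\r' to the terminal.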
I think the cause is the concurrent use of pipes for stdout/stderr in Popen and communicate(). I tried the check_output shortcut method from Python 2.7, but without success. Of course, the problem described above does not occur if I serialize all command executions and prints.
Is there a better way to use Popen and communicate() in parallel?
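One variation that may be worth trying is sketched below. It rests on an assumption that is not confirmed anywhere in this thread: that a child process inheriting the terminal as its stdin is what changes the terminal settings. The sketch redirects each child's stdin to os.devnull and otherwise keeps the same thread-per-group structure. It is written for Python 3 (on Python 2.6 the decode call and print would need adjusting), and the `results` list and the echo commands are hypothetical additions for illustration.

```python
import os
import shlex
import subprocess
from threading import Lock, Thread


def run_commands(commands, print_lock, results):
    """Run one command group sequentially, then print its output in one locked block."""
    outputs = []
    with open(os.devnull, "rb") as devnull:
        for command in commands:
            proc = subprocess.Popen(
                shlex.split(command),
                stdin=devnull,              # assumption: keep children off the terminal
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                close_fds=True,
            )
            output, _ = proc.communicate()  # reads all output and waits for exit
            outputs.append((command, proc.returncode, output))
    with print_lock:                        # one group prints at a time
        for command, returncode, output in outputs:
            for line in output.splitlines():
                print(line.decode(errors="replace"))
    results.append(outputs)


# Hypothetical command groups, for illustration only.
groups = [
    ["echo group 1 first", "echo group 1 second"],
    ["echo group 2"],
]

print_lock = Lock()
results = []
threads = [Thread(target=run_commands, args=(g, print_lock, results)) for g in groups]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("done.")
```

Recording the return code alongside each output also makes it possible to tell afterwards which command failed.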
Comments (3)
The final result, inspired by the comment from J.F. Sebastian:
http://bitbucket.org/daybreaker/kaist-cs443/src/247f9ecf3cee/tools/manage.py
It seems to be a Python bug.
I am not sure it is clear what run_commands actually needs to do, but it seems to simply poll a subprocess, ignore the return code, and continue the loop. When you reach the part where you print the output, how do you know the subprocesses have completed?
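For reference, a minimal sketch of the behaviour in question: as documented for the subprocess module, communicate() reads until end-of-file and then waits for the child to terminate, so returncode is already set when it returns. The child command below is a hypothetical stand-in, not one of the commands from the question.

```python
import subprocess
import sys

# Hypothetical child: prints one line and exits with status 3.
child = [sys.executable, "-c", "import sys; print('hello'); sys.exit(3)"]

proc = subprocess.Popen(child, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
output, _ = proc.communicate()   # blocks until the child has exited

# After communicate() returns, the exit status is available without polling.
returncode = proc.returncode
print(repr(output), returncode)
```

Checking returncode rather than discarding it would let run_commands report commands that failed.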
In your example code I noticed your use of:
to partially address the '\r' issue; use of
would have been helpful.