Better multithreaded use of Python subprocess.Popen & communicate()?

Posted on 2024-10-04 13:30:07

I'm running multiple commands which may take some time, in parallel, on a Linux machine running Python 2.6.

So I use the subprocess.Popen class and the communicate() method to run multiple command groups in parallel and to capture each command's output in one piece after it finishes.

import shlex
import subprocess

def run_commands(commands, print_lock):
    # This part runs in parallel, one thread per command group.
    outputs = []
    for command in commands:
        proc = subprocess.Popen(shlex.split(command), stdout=subprocess.PIPE, stderr=subprocess.STDOUT, close_fds=True)
        output, unused_err = proc.communicate()  # buffers the output and waits for the child to exit
        retcode = proc.poll()                    # return code of the finished child (unused here)
        outputs.append(output)
    with print_lock:  # print the whole group at once (synchronized)
        for output in outputs:
            for line in output.splitlines():
                print(line)

Elsewhere, it is called like this:

from threading import Lock, Thread

processes = []
print_lock = Lock()
for ...:
    commands = ...  # a group of commands is generated, which takes some time.
    processes.append(Thread(target=run_commands, args=(commands, print_lock)))
    processes[-1].start()
for p in processes:
    p.join()
print('done.')

The expected result is that the output of each command group is displayed in one piece, while the groups themselves execute in parallel.

But starting with the second output group (which thread comes second varies, of course, because scheduling is nondeterministic), the output is printed without newlines, each line is preceded by as many spaces as the previous line had characters, and input echo is turned off. The terminal ends up "garbled" or "crashed". (If I run the reset shell command, it returns to normal.)

At first I suspected the handling of '\r', but that was not the cause. As the code shows, I handle it with splitlines(), and I confirmed this by applying repr() to the output.

I think the cause is the concurrent use of pipes for stdout/stderr in Popen and communicate(). I tried the check_output shortcut in Python 2.7, without success. Of course, the problem does not occur if I serialize all command executions and prints.

Is there a better way to use Popen and communicate() in parallel?

Comments (3)

云柯 2024-10-11 13:30:07

Here is the final result, inspired by the comment from J.F. Sebastian:

http://bitbucket.org/daybreaker/kaist-cs443/src/247f9ecf3cee/tools/manage.py

It seems to be a Python bug.
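The linked script is not reproduced here. As a separate and unverified sketch, one thing worth ruling out is that the children inherit the parent's stdin, which is the terminal; a child that changes terminal modes could plausibly leave echo switched off. The version below keeps the structure from the question but gives every child `os.devnull` as stdin. The helper names `run_group` and `run_all`, the `results` list, and the use of Python 3 are assumptions made for illustration, not part of the original code.

```python
import os
import shlex
import subprocess
from threading import Lock, Thread


def run_group(commands, print_lock, results):
    """Run one group of commands serially, then print its output in one piece."""
    outputs = []
    for command in commands:
        # Assumption: detaching stdin from the terminal keeps children
        # from touching terminal settings such as echo.
        with open(os.devnull, "rb") as devnull:
            proc = subprocess.Popen(
                shlex.split(command),
                stdin=devnull,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                close_fds=True,
            )
            output, _ = proc.communicate()  # waits for exit, returns all output
        outputs.append((proc.returncode, output))
    with print_lock:  # keep each group's output contiguous
        for _returncode, output in outputs:
            for line in output.decode(errors="replace").splitlines():
                print(line)
    results.append(outputs)


def run_all(groups):
    """Start one thread per command group and wait for all of them."""
    print_lock = Lock()
    results = []
    threads = [Thread(target=run_group, args=(group, print_lock, results))
               for group in groups]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling `run_all([["ls -l"], ["uname -a", "date"]])` would run the two groups concurrently and print each group's lines together. Whether this removes the garbling depends on what the commands actually do with the terminal, which the question does not say.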

冰雪梦之恋 2024-10-11 13:30:07

It is not clear to me what run_commands is actually supposed to do; it seems to simply poll each subprocess, ignore the return code, and continue with the loop. When you reach the part that prints the output, how do you know the subprocesses have completed?
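On that last point, `communicate()` reads until end-of-file and then waits for the child to exit, so by the time it returns the process has finished and `returncode` is set; a later `poll()` only reports the same value. A minimal sketch, written for Python 3 and using a made-up child command that exits with status 3:

```python
import subprocess
import sys

# Hypothetical child: prints one line, then exits with status 3.
child = [sys.executable, "-c", "import sys; print('hello'); sys.exit(3)"]

proc = subprocess.Popen(child, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
output, _ = proc.communicate()  # blocks until the child has exited

finished = proc.returncode is not None      # already set at this point
same_code = proc.poll() == proc.returncode  # poll() adds no new information
print(finished, same_code, proc.returncode, output)
```

The return code is still worth recording next to the output, since the question's code discards it.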

撕心裂肺的伤痛 2024-10-11 13:30:07

In your example code I noticed your use of:

for line in output.splitlines(): 

to partially address the issue of '\r'; using

for line in output.splitlines(True): 

would have been helpful.
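For reference, the difference between the two calls can be checked directly. The sample string below is made up for illustration; it mixes '\r\n', a bare '\r', and '\n' line endings.

```python
sample = "first\r\nsecond\rthird\n"

without_ends = sample.splitlines()   # line terminators are dropped
with_ends = sample.splitlines(True)  # keepends=True: terminators are kept

print(without_ends)
print(with_ends)
```

With `keepends=True` each piece still carries its terminator, including any '\r', so printing those pieces would send the carriage returns to the terminal, while plain `splitlines()` strips them.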
