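Read streaming input from subprocess.communicate()
I am using Python's subprocess.communicate() to read stdout from a process that runs for about a minute.
How can I print each line of that process's stdout as it is produced, so that I can watch the output being generated, while still blocking until the process terminates before continuing?
subprocess.communicate() appears to return all of the output at once.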
To get a subprocess's output line by line as soon as the subprocess flushes its stdout buffer:
iter() is used to read lines as soon as they are written, to work around the read-ahead bug in Python 2.
If the subprocess's stdout uses block buffering instead of line buffering in non-interactive mode (which delays the output until the child's buffer is full or the child flushes it explicitly), you could try to force unbuffered output using the pexpect or pty modules, or the unbuffer, stdbuf, or script utilities; see Q: Why not just use a pipe (popen())?
Here's Python 3 code:
Note: unlike Python 2, which outputs the subprocess's bytestrings as is, Python 3 uses text mode (cmd's output is decoded using the locale.getpreferredencoding(False) encoding).
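As a minimal sketch of the Python 3 variant this answer describes, the function below iterates over the pipe in text mode and echoes each line as it arrives. The name run_and_stream is hypothetical, and text=True assumes Python 3.7 or later (older versions spell it universal_newlines=True). It cannot help if the child itself block-buffers its output.

```python
import subprocess
import sys


def run_and_stream(cmd):
    """Echo each line of cmd's stdout as it arrives; return (exit_code, lines)."""
    lines = []
    # text=True decodes the output using locale.getpreferredencoding(False);
    # bufsize=1 requests line buffering on the parent's side of the pipe.
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=1, text=True) as proc:
        for line in proc.stdout:
            print(line, end="", flush=True)
            lines.append(line)
    # Leaving the with block closes the pipe and waits for the child,
    # so returncode is set by the time we get here.
    return proc.returncode, lines
```

For a Python child, passing -u (for example [sys.executable, "-u", "script.py"], with script.py standing in for your own program) makes the child's stdout unbuffered, so lines show up as soon as they are printed.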
Please note, I think J.F. Sebastian's method (below) is better.
Here is a simple example (with no checking for errors):
If ls ends too fast, then the while loop may end before you've read all the data.
You can catch the remainder in stdout this way:
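The loop-then-drain idea in this answer can be sketched roughly as follows. This is an illustration under assumptions, not the answer's original listing: the function name stream_with_poll is hypothetical, and the loop condition based on poll() is inferred from the remark about the while loop ending early.

```python
import subprocess
import sys


def stream_with_poll(cmd):
    """Echo output while the child runs, then drain what is left.

    Returns (exit_code, full_output). No error checking.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    chunks = []
    # poll() returns None for as long as the child is still running.
    while proc.poll() is None:
        line = proc.stdout.readline()
        if line:
            print(line, end="", flush=True)
            chunks.append(line)
    # The child may have exited with data still sitting in the pipe;
    # communicate() reads the remainder and waits for the process.
    rest, _ = proc.communicate()
    if rest:
        print(rest, end="", flush=True)
        chunks.append(rest)
    return proc.returncode, "".join(chunks)
```

With a short-lived command such as ls, most or all of the output may arrive through the final communicate() call rather than through the loop, which is exactly the case the answer warns about.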
I believe the simplest way to collect output from a process in a streaming fashion is like this:
The readline() or read() function should only return an empty string on EOF, after the process has terminated; otherwise it blocks if there is nothing to read (readline() includes the newline, so on empty lines it returns "\n"). This avoids the need for an awkward final communicate() call after the loop.
On files with very long lines, read() may be preferable to reduce maximum memory usage. The number passed to it is arbitrary, but omitting it results in reading the entire pipe output at once, which is probably not desirable.
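A minimal sketch of this read-until-empty-string loop, assuming text mode (Python 3.7+ for text=True). The name read_until_eof and the chunk_size parameter are hypothetical, added only to show both the readline() and the read(n) variants.

```python
import subprocess
import sys


def read_until_eof(cmd, chunk_size=None):
    """Stream cmd's stdout until EOF; return (exit_code, collected_text).

    chunk_size=None reads line by line; an integer reads that many
    characters per call instead (the value is arbitrary).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    pieces = []
    while True:
        if chunk_size is None:
            data = proc.stdout.readline()
        else:
            data = proc.stdout.read(chunk_size)
        if data == "":  # empty string means EOF; a blank line would be "\n"
            break
        sys.stdout.write(data)
        sys.stdout.flush()
        pieces.append(data)
    proc.stdout.close()
    # No final communicate() is needed: the pipe is already drained.
    return proc.wait(), "".join(pieces)
```

Note that read(chunk_size) waits until that many characters are available or EOF is reached, so a large chunk size trades responsiveness for fewer calls.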
If you want a non-blocking approach, don't use process.communicate(). If you set the subprocess.Popen() argument stdout to PIPE, you can read from process.stdout and check whether the process is still running using process.poll().
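One caveat with the approach above is that a plain readline() on process.stdout still blocks until data arrives. A possible way around that, which is an assumption of this sketch rather than something the answer spells out, is to do the reading in a background thread and hand lines to the main thread through a queue. The names _pump and run_without_blocking are hypothetical.

```python
import queue
import subprocess
import sys
import threading
import time


def _pump(pipe, q):
    """Reader thread: forward each line to the queue, then None at EOF."""
    for line in pipe:
        q.put(line)
    pipe.close()
    q.put(None)


def run_without_blocking(cmd):
    """Collect cmd's output without the main thread blocking on the pipe.

    Returns (exit_code, lines).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    q = queue.Queue()
    threading.Thread(target=_pump, args=(proc.stdout, q), daemon=True).start()
    lines = []
    while True:
        try:
            line = q.get_nowait()
        except queue.Empty:
            # Nothing to read yet: the main thread is free to do other
            # work here, such as checking proc.poll().
            time.sleep(0.05)
            continue
        if line is None:  # sentinel: the child closed its stdout
            break
        print(line, end="", flush=True)
        lines.append(line)
    return proc.wait(), lines
```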
If you're simply trying to pass the output through in real time, it's hard to get simpler than this:
See the docs for subprocess.check_call().
If you need to process the output, sure, loop on it. But if you don't, just keep it simple.
Edit: J.F. Sebastian points out both that the defaults for the stdout and stderr parameters pass through to sys.stdout and sys.stderr, and that this will fail if sys.stdout and sys.stderr have been replaced (say, for capturing output in tests).
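A minimal sketch of the pass-through approach: with no stdout or stderr arguments, the child writes directly to the streams it inherits from the parent, and check_call() blocks until it exits. The wrapper name passthrough is hypothetical.

```python
import subprocess
import sys


def passthrough(cmd):
    """Run cmd with its output going straight to the inherited streams.

    Blocks until the child exits. Returns 0 on success and raises
    subprocess.CalledProcessError on a non-zero exit status.
    """
    return subprocess.check_call(cmd)
```

Because nothing is captured, there is no output to loop over; this fits only when the goal is to watch the output, not to process it.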
Adding another Python 3 solution with a few small changes:
(the exit code could not be obtained when using the with construct)
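The listing for this answer is not shown here, so the following is only a sketch of what such a variant might look like: it manages the Popen object by hand instead of using a with block and returns the exit code from wait(). The function name run_and_get_exit_code is hypothetical, and merging stderr into stdout is an assumption of this sketch, not something the text above states.

```python
import subprocess
import sys


def run_and_get_exit_code(cmd):
    """Echo the child's output line by line and return its exit code."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: show stderr in the same stream
        text=True,
    )
    try:
        for line in proc.stdout:
            print(line, end="", flush=True)
    finally:
        proc.stdout.close()
    # wait() blocks until the child has exited and returns its status.
    return proc.wait()
```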