Understanding how to use pipes and subprocesses in Python?

Posted 2024-09-14 23:48:29

I'm wrestling with the concepts behind subprocesses and pipes, and working with them in a Python context. If anybody could shed some light on these questions it would really help me out.

Say I have a pipeline set up as follows:
`createText.py | processText.py | cat`
processText.py is receiving data through stdin, but how is this implemented? How does it know that no more data will be coming and that it should exit? My guess is that it could look for an EOF and terminate based on that, but what if createText.py never sends one? Would that be considered an error on createText.py's part?
Say parent.py starts a child subprocess (child.py) and calls wait() to wait for the child to complete. If parent is capturing child's stdout and stderr as pipes, is it still safe to read from them after child has terminated? Or are the pipes (and data in them) destroyed when one end terminates?
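Both behaviours can be observed with a small experiment. The sketch below is an illustration only: the helper name `read_after_exit` and the inline child program are made up for the demo and are not part of the scripts above. The child writes one short line and exits; the parent calls wait() first and reads afterwards. The data is still readable because the kernel keeps the pipe buffer alive until the last descriptor on it is closed, and the second read returns an empty result, which is how EOF shows up once every write end has been closed. Waiting before reading is only safe here because the output is tiny; a child that writes more than the pipe buffer can hold would block, and wait() would not return.

```python
import subprocess
import sys

def read_after_exit():
    # Hypothetical child: writes one short line to stdout and exits.
    child_code = "import sys; sys.stdout.write('hello from child\\n')"
    child = subprocess.Popen([sys.executable, "-c", child_code],
                             stdout=subprocess.PIPE)
    child.wait()                   # the child has terminated at this point
    data = child.stdout.read()     # buffered data is still readable
    at_eof = child.stdout.read()   # b'' signals EOF: no writer is left
    child.stdout.close()
    return child.returncode, data, at_eof

if __name__ == "__main__":
    print(read_after_exit())
```

There is no explicit EOF message that the writer has to send; the reader sees EOF when the writer closes its end of the pipe, which also happens automatically when the writing process exits.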

The general problem that I want to solve is to write a Python script that calls rsync several times with the Popen class. I want my program to wait until rsync has completed, then check the return status to see if it exited correctly. If it didn't, I want to read the child's stderr to see what the error was. Here is what I have so far:

```
import subprocess
from subprocess import Popen

# Makes the rsync call. Blocks until the child process
# has finished. Returns True if rsync exited successfully.
def performRsync(src, dest):
    print("Pushing " + src + " to " + dest)
    child = Popen(['rsync', '-av', src, dest], shell=False,
                  stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                  universal_newlines=True)
    child.wait()
    # Check for success or failure;
    # 0 is the successful exit code here.
    if not child.returncode:
        return True
    else:
        out, err = child.communicate()
        print("ERR pushing " + src + ". " + err)
        return False
```
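A common alternative to calling wait() and reading afterwards is to let communicate() do both. The sketch below assumes a generic helper, here called `run_and_capture` (a made-up name), that takes the argument list so it can be tried without rsync; for the rsync case the list would be `['rsync', '-av', src, dest]`. communicate() drains stdout and stderr while it waits, so a child that produces a lot of output cannot stall on a full pipe, and returncode is set by the time it returns.

```python
import subprocess
import sys

def run_and_capture(cmd):
    """Run cmd, wait for it, and return (succeeded, stderr_text)."""
    child = subprocess.Popen(cmd, shell=False,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE,
                             universal_newlines=True)  # text instead of bytes
    # communicate() reads both pipes to EOF and then reaps the child.
    out, err = child.communicate()
    return child.returncode == 0, err

if __name__ == "__main__":
    # Stand-in for a failing rsync: write to stderr, exit with status 3.
    failing = [sys.executable, "-c",
               "import sys; sys.stderr.write('boom\\n'); sys.exit(3)"]
    ok, err = run_and_capture(failing)
    print(ok, repr(err))
```

Returning the stderr text instead of printing it inside the helper is a design choice for the sketch; it lets the caller decide how to report a failed transfer.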

Update: I also came across this problem. Consider these two simple files:
```
# createText.py
import time

for x in range(1000):
    print("creating line " + str(x))
    time.sleep(1)

# processText.py
import sys

while True:
    line = sys.stdin.readline()
    if not line:
        break
    print("I modified " + line)
```
Why does processText.py in this case not start printing as it gets data from stdin? Does a pipe collect some amount of buffered data before it passes it along?
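The delay most likely comes from the writer rather than from the pipe itself: when stdout is a pipe instead of a terminal, Python block-buffers it, so lines stay inside the writing process until the buffer fills or the process exits. Flushing after each line, or running the writer with `python -u`, makes each line visible to the reader right away. The sketch below is an illustration under assumptions: `stream_lines` is a made-up helper, and the child is a shortened stand-in for createText.py.

```python
import subprocess
import sys

# Shortened stand-in for createText.py: three lines, flushed one by one.
WRITER = (
    "import sys, time\n"
    "for x in range(3):\n"
    "    print('creating line ' + str(x))\n"
    "    sys.stdout.flush()  # push the line into the pipe now\n"
    "    time.sleep(0.1)\n"
)

def stream_lines():
    """Start the writer and process each line as soon as it arrives."""
    child = subprocess.Popen([sys.executable, "-c", WRITER],
                             stdout=subprocess.PIPE,
                             universal_newlines=True)
    results = []
    # readline() returns '' only at EOF, i.e. after the writer has exited.
    for line in iter(child.stdout.readline, ""):
        results.append("I modified " + line.rstrip("\n"))
    child.stdout.close()
    child.wait()
    return results

if __name__ == "__main__":
    for item in stream_lines():
        print(item)
```

In a real pipeline, processText.py would need the same treatment for its own output, since its stdout is also a pipe (into cat).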