python subprocess: "write error: Broken pipe"

Posted on 2024-09-30 18:20:22

I have a problem piping a simple subprocess.Popen.

Code:

import subprocess
cmd = 'cat file | sort -g -k3 | head -20 | cut -f2,3'
p = subprocess.Popen(cmd,shell=True,stdout=subprocess.PIPE)
for line in p.stdout:
    print(line.decode().strip())

Output for file ~1000 lines in length:

...
sort: write failed: standard output: Broken pipe
sort: write error

Output for file >241 lines in length:

...
sort: fflush failed: standard output: Broken pipe
sort: write error

Output for file <241 lines in length is fine.

I have been reading the docs and googling like mad but there is something fundamental about the subprocess module that I'm missing ... maybe to do with buffers. I've tried p.stdout.flush() and playing with the buffer size and p.wait(). I've tried to reproduce this with commands like 'sleep 20; cat moderatefile' but this seems to run without error.

Comments (5)

何必那么矫情 2024-10-07 18:20:22

From the recipes in the subprocess docs:

from subprocess import Popen, PIPE

# To replace shell pipeline like output=`dmesg | grep hda`
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]

酒与心事 2024-10-07 18:20:22

This is because you shouldn't use "shell pipes" in the command passed to subprocess.Popen. Chain the processes with subprocess.PIPE instead, like this:

from subprocess import Popen, PIPE

# Without shell=True, each command must be an argument list, not a single string.
p1 = Popen(['cat', 'file'], stdout=PIPE)
p2 = Popen(['sort', '-g', '-k', '3'], stdin=p1.stdout, stdout=PIPE)
p3 = Popen(['head', '-20'], stdin=p2.stdout, stdout=PIPE)
# p4 needs stdout=PIPE too, otherwise p4.stdout is None.
p4 = Popen(['cut', '-f2,3'], stdin=p3.stdout, stdout=PIPE)
final_output = p4.communicate()[0]

But I have to say that what you're trying to do could be done in pure Python instead of calling a bunch of shell commands.
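
As a rough illustration of the pure-Python route mentioned above, here is a minimal sketch. The function name top_rows is made up for this example, and it assumes the file is tab-separated, that every row has at least three columns, and that the third column always parses as a float. It only approximates sort -g -k3 (for instance, ties are left in input order), so treat it as a starting point rather than a drop-in replacement.

```python
def top_rows(lines, n=20):
    """Approximate `sort -g -k3 | head -n | cut -f2,3` for tab-separated lines.

    Hypothetical helper: assumes >= 3 tab-separated columns per row and a
    numeric third column.
    """
    # Split every non-empty line into its tab-separated fields.
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    # Sort numerically on the third column (index 2), like `sort -g -k3`.
    rows.sort(key=lambda fields: float(fields[2]))
    # Keep the first n rows (`head -n`) and columns 2 and 3 (`cut -f2,3`).
    return ["\t".join(fields[1:3]) for fields in rows[:n]]
```

Any iterable of lines works as input, so an open file object can be passed directly. Since no child processes are involved, there is no pipe that can break.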

任性一次 2024-10-07 18:20:22

I have been getting the same error. I even put the pipeline in a bash script and executed that from Python instead of passing the pipeline directly. Run from Python it produced the broken pipe error; run from bash it didn't.

It seems to me that the last command before head (here, sort) reports an error because its STDOUT has been closed. Python must be picking up on this, whereas with the shell the error is silent. I changed my code to consume the entire input and the error went away.

This would also explain why smaller files work: the pipe probably buffers the entire output before head exits. Larger files no longer fit in the buffer, so the write fails.

For example, instead of 'head -1' (in my case, I only wanted the first line), I used awk 'NR == 1'.

There are probably better ways of doing this, depending on where the 'head -X' occurs in the pipe.
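
One way to "consume the entire input" as described above is to drop head from the pipeline, read everything sort writes, and truncate in Python. The sketch below is only an illustration: the name first_sorted_lines is hypothetical, the sort flags are copied from the question, and it assumes a POSIX system with sort on the PATH and Python 3.7+ (for capture_output and text).

```python
import subprocess


def first_sorted_lines(text, n=20, sort_args=("-g", "-k3")):
    """Run sort on text and return its first n output lines.

    Hypothetical helper: all of sort's output is read before truncating,
    so sort never writes into a pipe whose reader has already exited.
    """
    proc = subprocess.run(
        ["sort", *sort_args],
        input=text,            # fed to sort's stdin
        capture_output=True,   # collect stdout and stderr
        text=True,             # work with str instead of bytes
        check=True,            # raise CalledProcessError if sort fails
    )
    # The slice plays the role of `head -n`.
    return proc.stdout.splitlines()[:n]
```

The trade-off is that the full sorted output is held in memory, which is unlikely to matter for files of a few thousand lines.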

大海や 2024-10-07 18:20:22

You don't need shell=True. Don't invoke the shell. This is how I would do it:

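import subprocess
# Without shell=True, cmd must be an argument list for one program, e.g. ['cat', 'file'].
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
stdout_value = p.communicate()[0]
stdout_value   # the output

See whether you still run into the buffer problem after using this.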

妖妓 2024-10-07 18:20:22

Try using communicate() rather than reading directly from stdout.

The Python docs say this:

"Warning: Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process."

http://docs.python.org/library/subprocess.html#subprocess.Popen.stdout

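import subprocess

p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
output = p.communicate()[0]  # communicate is a method, so it must be called
for line in output.splitlines():  # iterate over lines, not single bytes
    pass  # do stuff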
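
For the shell pipeline from the question, a communicate()-based version might look like the sketch below. The helper name run_pipeline is hypothetical. As background not stated in this thread: Python ignores SIGPIPE at startup, and on Python 3.2+ the restore_signals parameter (true by default, POSIX only) resets it to the default action in the child, so an upstream command such as sort is terminated quietly when head exits instead of reporting a write error. It is spelled out here only to make that assumption visible.

```python
import subprocess


def run_pipeline(cmd):
    """Run a shell pipeline; return (stdout lines, stderr text, exit status).

    Hypothetical helper. shell=True is needed because cmd contains "|".
    """
    proc = subprocess.Popen(
        cmd,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        restore_signals=True,  # the default on POSIX; shown for clarity
    )
    # communicate() drains both pipes, so no OS pipe buffer can fill up
    # and block the child while we wait for it.
    out, err = proc.communicate()
    return out.decode().splitlines(), err.decode(), proc.returncode
```

Returning stderr and the exit status alongside the output makes it easier to check whether any "Broken pipe" message is still being produced. The exit status is that of the shell, which normally reflects the last command in the pipeline.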