How can you diff two pipelines in Bash?
How can you diff two pipelines without using temporary files in Bash? Say you have two command pipelines:
foo | bar
baz | quux
And you want to find the diff in their outputs. One solution would obviously be to:
foo | bar > /tmp/a
baz | quux > /tmp/b
diff /tmp/a /tmp/b
Is it possible to do so without the use of temporary files in Bash? You can get rid of one temporary file by piping one of the pipelines into diff:
foo | bar > /tmp/a
baz | quux | diff /tmp/a -
But you can't pipe both pipelines into diff simultaneously (not in any obvious manner, at least). Is there some clever trick involving /dev/fd to do this without using temporary files?
A one-liner with two temporary files (not what you want) would be:
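A minimal sketch of such a one-liner. The functions foo, bar, baz and quux are hypothetical stand-ins for the question's placeholder commands, and the mktemp scratch directory is my assumption (the question used fixed names under /tmp):

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the question's placeholder commands.
foo()  { printf '%s\n' apple banana cherry; }
bar()  { tr '[:lower:]' '[:upper:]'; }
baz()  { printf '%s\n' apple blueberry cherry; }
quux() { tr '[:lower:]' '[:upper:]'; }

# Scratch directory for the two temporary files (an assumption, not from the answer).
tmpdir=$(mktemp -d)

# The one-liner: each pipeline writes a file, then diff compares the files.
# diff exits with status 1 when the files differ, which is recorded in $status.
status=0
foo | bar > "$tmpdir/a" && baz | quux > "$tmpdir/b" && diff "$tmpdir/a" "$tmpdir/b" || status=$?

rm -r "$tmpdir"
```

Note that both outputs land on disk and have to be cleaned up by hand, which is exactly what the question wants to avoid.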
With bash, though, you might try process substitution:
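A sketch of two ways this could look, again with hypothetical stand-in functions for foo, bar, baz and quux. The -u flag and the "|| true" guards are my additions so the result is easy to read and safe under "set -e":

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the question's placeholder commands.
foo()  { printf '%s\n' apple banana cherry; }
bar()  { tr '[:lower:]' '[:upper:]'; }
baz()  { printf '%s\n' apple blueberry cherry; }
quux() { tr '[:lower:]' '[:upper:]'; }

# Version 1: both pipelines are handed to diff via process substitution.
# diff exits with status 1 when the inputs differ, hence the "|| true".
both=$(diff -u <(foo | bar) <(baz | quux)) || true

# Version 2: the first pipeline arrives on stdin ("-"),
# and only the second one uses process substitution.
mixed=$(foo | bar | diff -u - <(baz | quux)) || true

printf '%s\n' "$both" "$mixed"
```

Neither version creates a file that you have to name or delete yourself.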
The second version will more clearly remind you which input was which, by showing --- /dev/stdin vs. +++ /dev/fd/63 or something similar, instead of two numbered fds.
Not even a named pipe will appear in the filesystem, at least on OSes where bash can implement process substitution using filenames like /dev/fd/63. Such a name gives the command something it can open and read, while actually reading from an already-open file descriptor that bash set up before exec'ing the command. (That is, bash uses pipe(2) before fork, and then dup2 to redirect from the output of quux to an input file descriptor for diff, on fd 63.)
On a system with no "magical" /dev/fd or /proc/self/fd, bash might use named pipes to implement process substitution, but it would at least manage them itself, unlike temporary files, and your data wouldn't be written to the filesystem.
You can check how bash implements process substitution with echo <(true), which prints the filename instead of reading from it. It prints /dev/fd/63 on a typical Linux system. Or, for more detail on exactly which system calls bash uses, you can trace its file and file-descriptor system calls on a Linux system (for example with strace).
Without bash, you could make a named pipe. Use - to tell diff to read one input from stdin, and use the named pipe as the other.
Note that you can only pipe one output to multiple inputs by using the tee command:
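A small sketch of that kind of tee command. The scratch directories and the a.txt and b.txt files are hypothetical, there only to make the example self-contained:

```shell
#!/usr/bin/env bash
# Scratch directory with a couple of hypothetical .txt files to list.
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt"

# The list is written to a separate directory so that txtlist.txt
# cannot be picked up by the *.txt glob while it is being created.
listdir=$(mktemp -d)

# tee copies its stdin to stdout (normally the terminal) and to the named file.
(cd "$workdir" && ls *.txt) | tee "$listdir/txtlist.txt"

saved=$(cat "$listdir/txtlist.txt")
rm -r "$workdir" "$listdir"
```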
The above command displays the output of ls *.txt on the terminal and also writes it to the text file txtlist.txt.
But with process substitution, you can use tee to feed the same data into multiple pipelines:
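A sketch of the idea, using seq, grep, tail and sed as illustrative commands (none of them come from the answer). Each >(...) branch gets its own copy of the data:

```shell
#!/usr/bin/env bash
# Two independent pipelines each receive a full copy of seq's output.
# tee's own stdout is discarded; the >(...) branches inherit the
# pipeline's stdout, so the final sort collects both results and only
# finishes once both branches have closed it.
summary=$(
  seq 1 10 |
    tee >(grep -c 1 | sed 's/^/lines containing a 1: /') \
        >(tail -n 1 | sed 's/^/last line: /') \
        > /dev/null |
    sort
)
printf '%s\n' "$summary"
```

The branches run asynchronously, so if they wrote to files instead, those files might not be complete at the moment tee returns. Collecting everything through one final stage, as above, is one way to sidestep that.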
In bash you can use subshells to execute the command pipelines individually, by enclosing each pipeline within parentheses. You can then prefix these with < to create anonymous named pipes, which you can then pass to diff.
For example:
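A small illustration, with printf and sort as stand-ins for the real pipelines (the fruit names are made up):

```shell
#!/usr/bin/env bash
# Each <( ... ) runs its pipeline in a subshell and expands to a filename
# that diff can open. "|| true" is there because diff exits with status 1
# when the inputs differ.
changes=$(diff <(printf '%s\n' pear apple fig | sort) \
               <(printf '%s\n' fig kiwi apple | sort)) || true
printf '%s\n' "$changes"
```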
The anonymous named pipes are managed by bash so they are created and destroyed automatically (unlike temporary files).
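For contrast, a rough sketch of doing the same job by hand with a named pipe made by mkfifo, which avoids process substitution altogether. The directory name, FIFO name and sample data are all hypothetical:

```shell
#!/bin/sh
fifo_dir=$(mktemp -d)
mkfifo "$fifo_dir/left"

# The writer runs in the background: opening a FIFO blocks until the other
# end is opened as well, so running writer and reader one after the other
# would hang.
printf '%s\n' apple banana > "$fifo_dir/left" &

# diff reads the FIFO as one input and stdin ("-") as the other.
# "|| true" because diff exits with status 1 when the inputs differ.
result=$(printf '%s\n' apple cherry | diff "$fifo_dir/left" -) || true

wait                # reap the background writer
rm -r "$fifo_dir"   # unlike with process substitution, cleanup is manual
printf '%s\n' "$result"
```

The data still only passes through a pipe, but creating and removing the FIFO is now your responsibility.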
Some people arriving at this page might be looking for a line-by-line diff, for which comm or grep -f should be used instead.
One thing to point out is that, in all of the answers' examples, the diff won't actually start until both streams have finished. Test this with, e.g.:
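One way to see this is a quick timing sketch. It is my own stand-in test, assuming seq and sleep as the streams, with the second stream kept open for two seconds after its last line:

```shell
#!/usr/bin/env bash
start=$SECONDS

# The second stream prints its 10 lines at once but only closes after 2 seconds.
# diff has to read both inputs to the end before it reports anything.
first_line=$(diff <(seq 100) <(seq 10; sleep 2) | head -n 1) || true

elapsed=$((SECONDS - start))
printf 'first diff line: %s (after %s seconds)\n' "$first_line" "$elapsed"
```

Even though every line of both streams is available almost immediately, nothing is reported until the slower stream has closed.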
If this is an issue, you could try sd (stream diff), which requires neither sorting (as comm does) nor process substitution like the above examples, is orders of magnitude faster than grep -f, and supports infinite streams.
The test example I propose can also be written with sd, but the difference is that seq 100 would be diffed with seq 10 right away. Note that, if one of the streams is a tail -f, the diff cannot be done with process substitution.
Here's a blog post I wrote about diffing streams on the terminal, which introduces sd.