How can you diff two pipelines in Bash?

Posted 2024-07-09 06:30:35

How can you diff two pipelines without using temporary files in Bash? Say you have two command pipelines:

foo | bar
baz | quux

And you want to find the diff in their outputs. One solution would obviously be to:

foo | bar > /tmp/a
baz | quux > /tmp/b
diff /tmp/a /tmp/b

Is it possible to do so without the use of temporary files in Bash? You can get rid of one temporary file by piping in one of the pipelines to diff:

foo | bar > /tmp/a
baz | quux | diff /tmp/a -

But you can't pipe both pipelines into diff simultaneously (not in any obvious manner, at least). Is there some clever trick involving /dev/fd to do this without using temporary files?

等待我真够勒 2024-07-16 06:30:35

A one-liner with two temporary files (not what you want) would be:

 foo | bar > file1.txt && baz | quux > file2.txt && diff file1.txt file2.txt

With bash, though, you could use process substitution:

 diff <(foo | bar) <(baz | quux)

 foo | bar | diff - <(baz | quux)  # or only use process substitution once

The second version makes it easier to tell which input was which: with diff -u, the header labels one input as standard input and the other as something like /dev/fd/63, instead of showing two numbered fds.
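As a concrete illustration of the two forms above, here is a minimal sketch in which the placeholder pipelines are replaced by small printf/sort functions. The names left and right are stand-ins invented for this example, not part of the original question. Note that diff exits with status 1 when its inputs differ, which is why the sketch appends || true.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for `foo | bar` and `baz | quux`.
left()  { printf '%s\n' cherry apple banana | sort; }
right() { printf '%s\n' banana apple date   | sort; }

# Both inputs come from process substitution; no temporary files are created.
# diff exits 1 when the inputs differ, so `|| true` keeps the status at 0.
both=$(diff <(left) <(right) || true)

# One input arrives on stdin ("-"), the other via process substitution.
mixed=$(left | diff - <(right) || true)

printf '%s\n' "$both"
```

Both forms should report the same difference; only the file labels in unified or context output would differ.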

With this approach, not even a named pipe appears in the filesystem, at least on operating systems where bash can implement process substitution with filenames like /dev/fd/63. Such a filename gives the command something it can open and read, while the data actually comes from an already-open file descriptor that bash set up before exec'ing the command. (That is, bash calls pipe(2) before fork, then uses dup2 to connect the output of quux to an input file descriptor for diff, here fd 63.)

On a system with no "magical" /dev/fd or /proc/self/fd, bash might use named pipes to implement process substitution, but it would at least manage them itself, unlike temporary files, and your data wouldn't be written to the filesystem.

You can check how bash implements process substitution by running echo <(true), which prints the filename instead of reading from it. It prints /dev/fd/63 on a typical Linux system. For more detail on exactly which system calls bash uses, this command on a Linux system will trace file and file-descriptor system calls:

strace -f -efile,desc,clone,execve bash -c '/bin/true | diff -u - <(/bin/true)'
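The echo <(true) check described above can be wrapped in a small script. This is only a sketch: the exact path printed depends on the operating system, and the two categories reported below are this example's own rough classification, not anything bash itself reports.

```shell
#!/usr/bin/env bash
# Pass <(...) as an argument to echo, so the substituted filename is
# printed instead of being opened and read.
path=$(echo <(true))
printf 'process substitution expanded to: %s\n' "$path"

# Rough classification of the mechanism in use (paths vary by OS).
case $path in
  /dev/fd/*|/proc/*) echo "file-descriptor path (no filesystem entry created)" ;;
  *)                 echo "other path, possibly a named pipe managed by bash" ;;
esac
```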

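Without bash, you could make a named pipe. Use - to tell diff to read one input from STDIN, and use the named pipe as the other. The writer has to run in the background (&), because opening a named pipe for writing blocks until a reader opens it:

mkfifo file1_pipe.txt
foo | bar > file1_pipe.txt & baz | quux | diff file1_pipe.txt -; rm file1_pipe.txt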
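A runnable sketch of the named-pipe approach, with a few assumptions made for illustration: the functions left and right are hypothetical stand-ins for the two pipelines, and the FIFO is created inside a private directory from mktemp -d (widely available, though not specified by POSIX) so that the name cannot collide with an existing file. The writer is started in the background because opening a FIFO for writing blocks until a reader opens it.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the two pipelines.
left()  { printf '%s\n' one two three; }
right() { printf '%s\n' one 2 three; }

dir=$(mktemp -d)            # private directory that holds the FIFO
fifo="$dir/left.fifo"
mkfifo "$fifo"

# Start the writer in the background: it blocks until diff opens the FIFO.
left > "$fifo" &

# diff reads the FIFO as its first input and stdin ("-") as its second.
# diff exits 1 when the inputs differ, hence `|| true`.
result=$(right | diff "$fifo" - || true)

wait                        # reap the background writer
rm -r "$dir"                # remove the FIFO and its directory
printf '%s\n' "$result"
```

The FIFO is a filesystem entry, but the data itself passes through the kernel's pipe buffer rather than being written to disk.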

Note that an ordinary pipe connects one output to exactly one input; to send one output to several destinations, use the tee command:

ls *.txt | tee /dev/tty txtlist.txt

The above command displays the output of ls *.txt to the terminal and outputs it to the text file txtlist.txt.

But with process substitution, you can use tee to feed the same data into multiple pipelines:

cat *.txt | tee >(foo | bar > result1.txt)  >(baz | quux > result2.txt) | foobar
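A small sketch of the tee pattern with real commands in place of the placeholders. One caveat worth hedging: the >(...) processes run asynchronously, so output files they write may not be complete when the pipeline returns. This example sidesteps that by letting both substitutions write to standard output inside a command substitution, which reads until every writer has finished. The "first:" and "count:" labels are invented for the example, and the order of the two result lines is not guaranteed, hence the final sort.

```shell
#!/usr/bin/env bash
# Feed the same three lines to two independent pipelines via tee.
# tee's own stdout goes to /dev/null; the two >(...) pipelines inherit
# the command substitution's stdout, so $(...) waits for both of them.
summary=$(
  printf '%s\n' b a c |
    tee >(sort | head -n 1 | sed 's/^/first: /') \
        >(wc -l | sed 's/^ */count: /') \
        >/dev/null
)

# The two lines may arrive in either order; sort makes the display stable.
printf '%s\n' "$summary" | sort
```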

雨的味道风的声音 2024-07-16 06:30:35

In bash you can run each command pipeline in its own subshell by enclosing it in parentheses. Prefixing the parentheses with < turns this into process substitution: bash connects the pipeline's output to a pipe and passes diff a filename (such as /dev/fd/63) that reads from it.

For example:

diff <(foo | bar) <(baz | quux)

These pipes are managed by bash, so they are created and destroyed automatically (unlike temporary files).

暗喜 2024-07-16 06:30:35

Some people arriving at this page might be looking for a line-by-line diff, for which comm or grep -f should be used instead.
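As a sketch of that line-by-line comparison, the following finds the lines present in one stream but not in the other, once with comm and once with grep -f. The seq ranges are arbitrary sample data chosen for this example. comm needs both inputs sorted; the grep variant uses -F (fixed strings), -x (whole-line match), and -v (invert) so that only exact lines are excluded.

```shell
#!/usr/bin/env bash
# Lines that appear only in the first stream (1..5) and not in the second (3..7).
# comm requires sorted input; -23 suppresses columns 2 and 3.
only_left_comm=$(comm -23 <(seq 1 5 | sort) <(seq 3 7 | sort))

# Same question with grep: use the second stream as a list of fixed,
# whole-line patterns and print the lines that do NOT match any of them.
only_left_grep=$(grep -vxFf <(seq 3 7) <(seq 1 5))

printf '%s\n' "$only_left_comm"
```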

One thing to point out is that, in all of the answers' examples, the diff won't actually start until both streams have finished. You can test this with, e.g.:

comm -23 <(seq 100 | sort) <({ seq 10 20 && sleep 5 && seq 20 30; } | sort)

If this is an issue, you could try sd (stream diff), which doesn't require sorting (as comm does) or process substitution (as the examples above do), is orders of magnitude faster than grep -f, and supports infinite streams.

The test example I propose would be written like this in sd:

seq 100 | sd 'seq 10 20 && sleep 5 && seq 20 30'

The difference is that the output of seq 100 would be diffed against the output of seq 10 20 right away. Note that if one of the streams is a tail -f, the diff cannot be done with process substitution.

Here's a blog post I wrote about diffing streams on the terminal, which introduces sd.
