Not understanding dd command arguments at all

Posted on 2024-11-25 10:07:24

I'm passingly familiar with the dd command, but I've rarely had the need to use it myself. Today I need to, but I'm encountering behavior that seems really weird.

I want to create a 100M text file, each line of which contains the single word "testing." This was my first try:

~$ perl -e 'print "testing\n" while 1' | dd of=X bs=1M count=100
0+100 records in
0+100 records out
561152 bytes (561 kB) copied, 0.00416429 s, 135 MB/s

Hmm, that's odd. What about other combinations?

~$ perl -e 'print "testing\n" while 1' | dd of=X bs=100K count=1K
0+1024 records in
0+1024 records out
4268032 bytes (4.3 MB) copied, 0.0353145 s, 121 MB/s

~$ perl -e 'print "testing\n" while 1' | dd of=X bs=10K count=10K
86+10154 records in
86+10154 records out
42524672 bytes (43 MB) copied, 0.35403 s, 120 MB/s

~$ perl -e 'print "testing\n" while 1' | dd of=X bs=1K count=100K
102400+0 records in
102400+0 records out
104857600 bytes (105 MB) copied, 0.879549 s, 119 MB/s

So of these four apparently-equivalent commands, all produce files of different sizes, only one of which is the one I would expect. Why is that?

EDIT: By the by, I'm a little embarrassed I didn't think of "yes testing" instead of that longer Perl command.
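
As a quick check on "apparently equivalent": the arithmetic below confirms that all four bs/count pairs request the same number of bytes, so the differing file sizes cannot come from the arguments themselves. The helper name requested_bytes is made up for this illustration, and the K and M suffixes are written out as powers of 1024.

```shell
# requested_bytes is a hypothetical helper: block size in bytes times block count.
requested_bytes() { echo $(( $1 * $2 )); }

requested_bytes 1048576 100     # bs=1M   count=100
requested_bytes 102400  1024    # bs=100K count=1K
requested_bytes 10240   10240   # bs=10K  count=10K
requested_bytes 1024    102400  # bs=1K   count=100K
# each line prints 104857600
```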

缪败 2024-12-02 10:07:24

To see what's going on, let's look at the output of strace for a similar invocation:

execve("/bin/dd", ["dd", "of=X", "bs=1M", "count=2"], [/* 72 vars */]) = 0
…
read(0, "testing\ntesting\ntesting\ntesting\n"..., 1048576) = 69632
write(1, "testing\ntesting\ntesting\ntesting\n"..., 69632) = 69632
read(0, "testing\ntesting\ntesting\ntesting\n"..., 1048576) = 8192
write(1, "testing\ntesting\ntesting\ntesting\n"..., 8192) = 8192
close(0)                                = 0
close(1)                                = 0
write(2, "0+2 records in\n0+2 records out\n", 31) = 31
write(2, "77824 bytes (78 kB) copied", 26) = 26
write(2, ", 0.000505796 s, 154 MB/s\n", 26) = 26
…

What happens is that dd makes a single read() call to read each block, and count limits the number of those reads, not the number of bytes. That is why the report says "0+100 records in": zero full blocks and 100 partial ones. This behavior is appropriate when reading from a tape, which is what dd was originally mainly used for. On tapes, read really reads a block. When reading from a file, you have to be careful not to specify too large a block size, or else the read will be truncated. When reading from a pipe, it's worse: the size of the block that you read depends on how much data the producing command has managed to put in the pipe by the time dd calls read().

The moral of the story is not to use dd to copy data, except with safe, small blocks. And never from a pipe except with bs=1.

(GNU dd has an iflag=fullblock option that makes it keep reading until each block is full. But other implementations don't necessarily have it.)
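
For the original goal of a file of exactly 100 MiB, one way to sidestep dd altogether is to let head count bytes. This is only a sketch: it assumes a head that supports -c (GNU and BSD versions do), and it reuses the output name X from the question.

```shell
# head -c stops after exactly the requested number of bytes,
# no matter how the pipe delivers them.
size=$((100 * 1024 * 1024))
yes testing | head -c "$size" > X
wc -c < X   # 104857600
```

Because "testing\n" is 8 bytes and 104857600 is a multiple of 8, the file also ends on a complete line.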

夜深人未静 2024-12-02 10:07:24

I'm not yet sure why, but dd invoked this way does not fill up an entire block before writing it out. Try:

perl -e 'print "testing\n" while 1' | dd of=output.txt bs=10K count=10K iflag=fullblock
10240+0 records in
10240+0 records out
104857600 bytes (105 MB) copied, 2.79572 s, 37.5 MB/s

The iflag=fullblock option seems to force dd to accumulate input until the block is full, although I'm not sure why this is not the default, or what it actually does by default.
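
The difference is easier to see on a small scale. The sketch below assumes GNU dd (for iflag=fullblock) and a sleep that accepts fractional seconds; slow_producer, partial.txt and full.txt are names invented for the illustration. The producer emits one 8-byte line at a time with a pause in between, so each read() on the pipe tends to return less than the 16-byte block.

```shell
# Hypothetical producer: eight "testing" lines, one every 0.1 s.
slow_producer() {
  for i in 1 2 3 4 5 6 7 8; do
    printf 'testing\n'
    sleep 0.1
  done
}

# Default: each short read counts as one (partial) block, so count=2
# usually ends after two lines rather than after 32 bytes.
slow_producer | dd of=partial.txt bs=16 count=2 2>/dev/null

# With fullblock, dd keeps reading until each 16-byte block is full.
slow_producer | dd of=full.txt bs=16 count=2 iflag=fullblock 2>/dev/null

wc -c < partial.txt   # timing dependent, typically fewer than 32
wc -c < full.txt      # 32
```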

折戟 2024-12-02 10:07:24

My best guess is that dd reads from the pipe and when it's empty it assumes that it read the whole block. The results are quite inconsistent:

$ perl -e 'print "testing\n" while 1' | dd of=X bs=1M count=100
0+100 records in
0+100 records out
413696 bytes (414 kB) copied, 0.0497362 s, 8.3 MB/s

user@andromeda ~
$ perl -e 'print "testing\n" while 1' | dd of=X bs=1M count=100
0+100 records in
0+100 records out
409600 bytes (410 kB) copied, 0.0484852 s, 8.4 MB/s
