How do you create a zip bomb?

Posted 2024-08-05 15:22:04

This question about zip bombs naturally led me to the Wikipedia page on the topic. The article mentions an example of a 45.1 kb zip file that decompresses to 1.3 exabytes.

What are the principles/techniques that would be used to create such a file in the first place? I don't want to actually do this, more interested in a simplified "how-stuff-works" explanation of the concepts involved.

The article mentions 9 layers of zip files, so it's not a simple case of zipping a bunch of zeros. Why 9, why 10 files in each?

Comments (15)

梦一生花开无言 2024-08-12 15:22:04

Citing from the Wikipedia page:

One example of a Zip bomb is the file 45.1.zip, which was 45.1 kilobytes of compressed data containing nine layers of nested zip files in sets of 10, each bottom-layer archive containing a 1.30 gigabyte file, for a total of 1.30 exabytes of uncompressed data.

So all you need is one single 1.3GB file full of zeroes, compress that into a ZIP file, make 10 copies, pack those into a ZIP file, and repeat this process 9 times.

This way, you get a file which, when uncompressed completely, produces an absurd amount of data without requiring you to start out with that amount.

Additionally, the nested archives make it much harder for programs like virus scanners (the main target of these "bombs") to be smart and refuse to unpack archives that are "too large". Until the last level, the total amount of data is not that much; you don't "see" how large the files at the lowest level are until you have reached that level; and no individual file is "too large". Only the huge total is a problem.
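The recipe above can be sketched with Python's standard zipfile module. This is a scaled-down illustration (1 MB of zeros, 10 copies, 3 layers; the names layer0.zip and copy0.zip are arbitrary), not the actual procedure behind 45.1.zip:

```python
import os
import zipfile

def build_zip_bomb(zeros_size, copies, layers, workdir="."):
    """Build a nested zip bomb: zip a run of zeros, then repeatedly
    zip `copies` copies of the previous layer, `layers` times."""
    # Layer 0: a single highly compressible file of zeros.
    current = os.path.join(workdir, "layer0.zip")
    with zipfile.ZipFile(current, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("zeros.bin", b"\0" * zeros_size)

    for layer in range(1, layers + 1):
        nxt = os.path.join(workdir, f"layer{layer}.zip")
        with zipfile.ZipFile(nxt, "w", zipfile.ZIP_DEFLATED) as zf:
            for i in range(copies):
                # The same archive stored under `copies` different names.
                zf.write(current, arcname=f"copy{i}.zip")
        current = nxt
    return current

# Scaled-down demo: fully unpacked this is 10**3 x 1 MB = 1 GB,
# yet every intermediate layer stays tiny on disk.
bomb = build_zip_bomb(zeros_size=1_000_000, copies=10, layers=3)
print(os.path.getsize(bomb), "bytes on disk")
```

The layers stay small because each layer's archive, when re-zipped as a member of the next layer, consists of ten identical chunks that fall inside DEFLATE's 32 KB window and largely compress away again.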

一紙繁鸢 2024-08-12 15:22:04

Create a 1.3 exabyte file of zeros.

Right click > Send to compressed (zipped) folder.

对不⑦ 2024-08-12 15:22:04

This is easily done under Linux using the following command:

dd if=/dev/zero bs=1024 count=10000 | zip zipbomb.zip -

Replace count with the number of KiB you want to compress. The example above creates a zip that unpacks to roughly 10 MiB (not much of a bomb at all, but it shows the process).

You DO NOT need hard disk space to store all the uncompressed data.
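The same point holds in pure Python: since Python 3.6, ZipFile.open(name, "w") returns a writable stream for a single member, so the zeros can be generated a chunk at a time and the uncompressed data never exists in memory or on disk (the 10 MiB size mirrors the dd example; the file names here are arbitrary):

```python
import zipfile

def stream_zeros_into_zip(zip_path, member, total_bytes, chunk=1 << 20):
    """Deflate `total_bytes` of zeros into one zip member without ever
    materializing the uncompressed data in memory or on disk."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        with zf.open(member, "w") as stream:  # writable member stream
            block = b"\0" * chunk
            remaining = total_bytes
            while remaining > 0:
                n = min(remaining, chunk)
                stream.write(block[:n])
                remaining -= n

# 10 MiB of zeros, like the dd example above.
stream_zeros_into_zip("zipbomb.zip", "zeros.bin", 10 * 1024 * 1024)
```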

为人所爱 2024-08-12 15:22:04

Below is for Windows:

From the Security Focus proof of concept (NSFW!), it's a ZIP file with 16 folders, each with 16 folders, which goes on like so (42 is the zip file name):

\42\lib 0\book 0\chapter 0\doc 0\0.dll
...
\42\lib F\book F\chapter F\doc F\0.dll

I'm probably wrong with this figure, but it produces 4^16 (4,294,967,296) directories. Because each directory needs allocation space of N bytes, it ends up being huge. The dll file at the end is 0 bytes.

Unzipping the first directory alone, \42\lib 0\book 0\chapter 0\doc 0\0.dll, results in 4 GB of allocated space.

芯好空 2024-08-12 15:22:04

Serious answer:

(Very basically) Compression relies on spotting repeating patterns, so the zip file would contain data representing something like

0x100000000000000000000000000000000000  
(Repeat this '0' ten trillion times)

Very short zip file, but huge when you expand it.
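This intuition is easy to check with Python's zlib module, which implements the same DEFLATE algorithm zip uses; ten million repeated bytes collapse to a few kilobytes of back-references:

```python
import zlib

data = b"0" * 10_000_000         # 10 MB of one repeating byte
packed = zlib.compress(data, 9)  # DEFLATE at maximum effort
print(len(data), "->", len(packed), "bytes")
```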

高速公鹿 2024-08-12 15:22:04

The article mentions 9 layers of zip files, so it's not a simple case of zipping a bunch of zeros. Why 9, why 10 files in each?

First off, the Wikipedia article currently says 5 layers with 16 files each. Not sure where the discrepancy comes from, but it's not all that relevant. The real question is why use nesting in the first place.

DEFLATE, the only commonly supported compression method for zip files*, has a maximum compression ratio of 1032. This can be achieved asymptotically for any repeating sequence of 1-3 bytes. No matter what you do to a zip file, as long as it is only using DEFLATE, the unpacked size will be at most 1032 times the size of the original zip file.

Therefore, it is necessary to use nested zip files to achieve really outrageous compression ratios. If you have 2 layers of compression, the maximum ratio becomes 1032^2 = 1065024. For 3, it's 1099104768, and so on. For the 5 layers used in 42.zip, the theoretical maximum compression ratio is 1170572956434432. As you can see, the actual 42.zip is far from that level. Part of that is the overhead of the zip format, and part of it is that they just didn't care.
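The 1032 ceiling is easy to observe with zlib (DEFLATE's reference implementation): the measured ratio climbs toward 1032 as the input grows but never reaches it, because each back-reference covers at most 258 bytes and costs at least 2 bits.

```python
import zlib

# Ratio approaches, but never exceeds, 258 bytes / 2 bits = 1032.
for size in (10**4, 10**6, 10**8):
    ratio = size / len(zlib.compress(b"\0" * size, 9))
    print(f"{size:>11,} bytes of zeros -> ratio {ratio:,.0f}")
```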

If I had to guess, I'd say that 42.zip was formed by just creating a large empty file, and repeatedly zipping and copying it. There is no attempt to push the limits of the format or maximize compression or anything - they just arbitrarily picked 16 copies per layer. The point was to create a large payload without much effort.

Note: Other compression formats, such as bzip2, offer much, much, much larger maximum compression ratios. However, most zip parsers don't accept them.

P.S. It is possible to create a zip file which will unzip to a copy of itself (a quine). You can also make one that unzips to multiple copies of itself. Therefore, if you recursively unzip a file forever, the maximum possible size is infinite. The only limitation is that it can increase by at most 1032 on each iteration.

P.P.S. The 1032 figure assumes that file data in the zip are disjoint. One quirk of the zip file format is that it has a central directory which lists the files in the archive and offsets to the file data. If you create multiple file entries pointing to the same data, you can achieve much higher compression ratios even with no nesting, but such a zip file is likely to be rejected by parsers.
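The central directory described in the P.P.S. can be inspected from Python: every entry records the byte offset of its member's local header, and an overlapping-entry bomb works by making many entries record the same offset. In an ordinary zip, as below, the offsets are all distinct:

```python
import zipfile

# Build an ordinary two-member zip.
with zipfile.ZipFile("demo.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", b"A" * 1000)
    zf.writestr("b.txt", b"B" * 1000)

# The central directory stores, for each member, the offset of its
# local header -- the field an overlapping bomb points at one place.
with zipfile.ZipFile("demo.zip") as zf:
    offsets = {info.filename: info.header_offset for info in zf.infolist()}
print(offsets)
```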

相守太难 2024-08-12 15:22:04

To create one in a practical setting (i.e. without creating a 1.3 exabyte file on your enormous hard drive), you would probably have to learn the file format at a binary level and write something that translates to what your desired file would look like, post-compression.

自找没趣 2024-08-12 15:22:04

A nice way to create a zipbomb (or gzbomb) is to know the binary format you are targeting. Otherwise, even if you use a streaming file (for example using /dev/zero) you'll still be limited by computing power needed to compress the stream.

A nice example of a gzip bomb: http://selenic.com/googolplex.gz57 (there's a message embedded in the file after several levels of compression, resulting in huge files).

Have fun finding that message :)

oО清风挽发oО 2024-08-12 15:22:04

It is not necessary to use nested files; you can take advantage of the zip format to overlap data.

https://www.bamsoftware.com/hacks/zipbomb/

"This article shows how to construct a non-recursive zip bomb that achieves a high compression ratio by overlapping files inside the zip container. "Non-recursive" means that it does not rely on a decompressor's recursively unpacking zip files nested within zip files: it expands fully after a single round of decompression. The output size increases quadratically in the input size, reaching a compression ratio of over 28 million (10 MB → 281 TB) at the limits of the zip format. Even greater expansion is possible using 64-bit extensions. The construction uses only the most common compression algorithm, DEFLATE, and is compatible with most zip parsers."

"Compression bombs that use the zip format must cope with the fact that DEFLATE, the compression algorithm most commonly supported by zip parsers, cannot achieve a compression ratio greater than 1032. For this reason, zip bombs typically rely on recursive decompression, nesting zip files within zip files to get an extra factor of 1032 with each layer. But the trick only works on implementations that unzip recursively, and most do not. The best-known zip bomb, 42.zip, expands to a formidable 4.5 PB if all six of its layers are recursively unzipped, but a trifling 0.6 MB at the top layer. Zip quines, like those of Ellingsen and Cox, which contain a copy of themselves and thus expand infinitely if recursively unzipped, are likewise perfectly safe to unzip once."

水晶透心 2024-08-12 15:22:04

Recent (post-1995) compression algorithms like bz2, lzma (7-zip) and rar give spectacular compression of monotonous files, and a single layer of compression is sufficient to wrap oversized content to a manageable size.

Another approach could be to create a sparse file of extreme size (exabytes) and then compress it with something mundane that understands sparse files (e.g. tar). If an examiner streams the file, they will need to read past all the zeros that exist only to pad the actual content; if they write it to disk instead, very little space will be used (assuming a well-behaved unarchiver and a modern filesystem).
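A minimal sketch of the sparse-file idea, assuming a filesystem that supports holes (most Linux filesystems do; on one that does not, the file would be fully allocated): seeking past the end and writing one byte yields a file whose apparent size far exceeds the blocks actually allocated.

```python
import os

path = "sparse.bin"
size = 256 * 1024 * 1024           # nominal 256 MiB, almost all hole

with open(path, "wb") as f:
    f.write(b"actual content")     # a little real data at the front
    f.seek(size - 1)               # seek far ahead, creating a hole
    f.write(b"\0")                 # one byte at the end fixes the length

st = os.stat(path)
print("apparent size:", st.st_size)
print("bytes actually allocated:", st.st_blocks * 512)
```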

夏日落 2024-08-12 15:22:04

Silicon Valley Season 3 Episode 7 brought me here. The steps to generate a zip bomb would be.

  1. Create a dummy file with zeros (or ones if you think they're skinny) of size (say 1 GB).
  2. Compress this file to a zip-file say 1.zip.
  3. Make n (say 10) copies of this file and add these 10 files to a compressed archive (say 2.zip).
  4. Repeat step 3 k times.
  5. You'll get a zip bomb.

For a Python implementation, check this.

烂人 2024-08-12 15:22:04

All file compression algorithms rely on the entropy of the information to be compressed.
Theoretically you can compress a stream of 0's or 1's, and if it's long enough, it will compress very well.

That's the theory part. The practical part has already been pointed out by others.

月依秋水 2024-08-12 15:22:04

Tried it. The output was a small 84 KB zip file.

Steps I made so far:

  1. Create a 1.4 GB .txt file full of '0'.
  2. Compress it.
  3. Rename the .zip to .txt, then make 16 copies.
  4. Compress all of it into a .zip file.
  5. Rename the renamed .txt files inside the .zip file back to .zip.
  6. Repeat steps 3 to 5 eight times.
  7. Enjoy :)

Though I don't know how to explain the part where compressing the renamed zip files still shrinks them, it works. Maybe I just lack the technical terms.

燕归巢 2024-08-12 15:22:04

I don't know if ZIP uses Run Length Encoding, but if it did, such a compressed file would contain a small piece of data and a very large run-length value. The run-length value would specify how many times the small piece of data is repeated. When you have a very large value, the resultant data is proportionally large.
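ZIP's DEFLATE actually uses LZ77 back-references plus Huffman coding rather than plain run-length encoding, but the intuition is right, and a toy RLE codec (an illustrative sketch, not the real zip format) shows the same asymmetry:

```python
def rle_encode(data):
    """Collapse runs of identical bytes into (count, byte) pairs."""
    runs = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        runs.append((j - i, data[i]))
        i = j
    return runs

def rle_decode(runs):
    """Expand (count, byte) pairs back into the original bytes."""
    return b"".join(bytes([byte]) * count for count, byte in runs)

# One tiny (count, byte) pair describes a million bytes: the bomb
# asymmetry in miniature.
runs = rle_encode(b"\0" * 1_000_000)
print(runs)  # [(1000000, 0)]
```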

已下线请稍等 2024-08-12 15:22:04

Perhaps, on unix, you could pipe a certain amount of zeros directly into a zip program or something? I don't know enough about unix to explain how you would do that, though. Other than that, you would need a source of zeros and pipe them into a zipper that reads from stdin or something...
