
UNIX tar archive

Why is a tar file smaller than its contents?

Posted on 07-13 04:00

I have a directory I’m archiving:

$ du -sh oldcode
1400848
$ tar cf oldcode.tar oldcode

So the directory is 1.4 GB. The file is significantly smaller, though:

$ ls -l oldcode.tar
-rw-r--r-- 1 ieure ieure 940339200 2002-01-30 10:33 oldcode.tar

Only 897 MB. It’s not compressed in any way:

$ file oldcode.tar
oldcode.tar: POSIX tar archive

Why is the tar file smaller than its contents?

Comments (5)

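夢归不見 (2024-07-20 04:00:50)

You get different results because of the way filesystems work.

In short, your disk is made up of clusters. Each cluster has a fixed size, say 4 KB. If you store a 1 KB file in such a cluster, 3 KB of it goes unused. The exact details vary with the type of filesystem you use, but most filesystems work this way.

3 KB of wasted space is not much for a single file, but if you have a lot of very small files, the waste can become a significant part of the disk usage.

Inside a tar archive the files are not stored in clusters but one after another. That is where the difference comes from.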
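
The arithmetic behind this can be sketched in shell. The 4096-byte cluster size just follows the example above. The 512-byte figures come from the standard tar format, which writes a 512-byte header per file and pads the data to a multiple of 512 bytes, so an archive still has some per-file overhead, only much less. The function names and the file counts are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; 4096 is an example cluster size, not a universal value.

# Bytes a file of $1 bytes occupies on disk with clusters of $2 bytes.
on_disk_bytes() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# Bytes the same file occupies inside a tar archive:
# one 512-byte header plus the data padded to a multiple of 512.
in_tar_bytes() {
    echo $(( 512 + ($1 + 511) / 512 * 512 ))
}

# 100000 files of 1000 bytes each (made-up numbers).
count=100000
size=1000
echo "logical: $(( count * size ))"
echo "on disk: $(( count * $(on_disk_bytes "$size" 4096) ))"
echo "in tar:  $(( count * $(in_tar_bytes "$size") ))"
```

With these made-up numbers the script prints 100000000, 409600000 and 153600000 bytes: the same data takes roughly four times its logical size on disk but only about one and a half times inside the archive.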


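十六岁半 (2024-07-20 04:00:50)

Not knowing which tar or which Unix system you are using, my guess is this: oldcode contains many smaller files, which individually use disk space inefficiently because disk space is allocated in blocks of some size rather than byte by byte. In the tar file they are concatenated and make full use of the disk space allocated to them.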


寻找我们的幸福 (2024-07-20 04:00:50)

This has to do with the block size of your filesystem. man 1 du on Mac OS X 10.5.6 states:

The du utility displays the file system block usage for each file argument and for each directory in the file hierarchy rooted in each directory argument. If no file is specified, the block usage of the hierarchy rooted in the current directory is displayed.

[mirko@borg foo]$ ls -la
total 0
drwxr-xr-x   2 mirko  wheel   68 Jan 30 21:20 .
drwxrwxrwt  10 root   wheel  340 Jan 30 21:16 ..
[mirko@borg foo]$ du -sh
0B  .
[mirko@borg foo]$ touch foo
[mirko@borg foo]$ ls -la
total 0
drwxr-xr-x   3 mirko  wheel  102 Jan 30 21:20 .
drwxrwxrwt  10 root   wheel  340 Jan 30 21:16 ..
-rw-r--r--   1 mirko  wheel    0 Jan 30 21:20 foo
[mirko@borg foo]$ du -sh
0B  .
[mirko@borg foo]$ echo 1 > foo
[mirko@borg foo]$ ls -la
total 8
drwxr-xr-x   3 mirko  wheel  102 Jan 30 21:20 .
drwxrwxrwt  10 root   wheel  340 Jan 30 21:16 ..
-rw-r--r--   1 mirko  wheel    2 Jan 30 21:20 foo
[mirko@borg foo]$ du -sh
4.0K    .

As you can see, even a 2-byte file takes up a whole 4 KB block. Some filesystems avoid this waste of space through block suballocation.
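
The same comparison can be scripted for a single file. This is only a sketch: the helper names are made up, wc -c is used for the logical size and du -k for the allocated size, and the numbers you get depend on your filesystem.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the two notions of "size".

# Logical size in bytes, i.e. what ls -l reports.
logical_bytes() {
    echo $(( $(wc -c < "$1") ))
}

# Allocated size in KiB, i.e. what du counts (-k forces 1024-byte units).
allocated_kib() {
    du -k "$1" | awk '{ print $1 }'
}

dir=$(mktemp -d)
echo 1 > "$dir/foo"          # same 2-byte file as in the transcript
echo "logical:   $(logical_bytes "$dir/foo") bytes"
echo "allocated: $(allocated_kib "$dir/foo") KiB"
rm -r "$dir"
```

On a filesystem with 4 KB blocks this would typically report 2 bytes and 4 KiB, matching the transcript. Filesystems that store tiny files inline or suballocate blocks may report less.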


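紫罗兰の梦幻 (2024-07-20 04:00:50)

There are two possibilities.

Small files

Most likely, the tar file is not actually smaller than its contents. As Nils Pipenbrinck wrote, du shows the amount of space the filesystem has allocated, which is larger than the logical size of the files because files are stored in whole filesystem blocks.

To see the logical size of the files, use du --apparent-size. In that case the result should be smaller than the tar file.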
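
Since --apparent-size is an option of GNU du and may not exist elsewhere, here is a rough sketch that gets a similar number by adding up the bytes of every regular file, and then compares it with du and with an uncompressed tar stream. The helper names and the 200 one-byte test files are made up for the example. Unlike du --apparent-size, this sum ignores the size of the directories themselves.

```shell
#!/bin/sh
# Sum of the logical sizes of all regular files under directory $1.
logical_total() {
    echo $(( $(find "$1" -type f -exec cat {} + | wc -c) ))
}

# Size in bytes of an uncompressed tar archive of directory $1.
tar_total() {
    echo $(( $(tar cf - "$1" 2>/dev/null | wc -c) ))
}

dir=$(mktemp -d)
i=0
while [ "$i" -lt 200 ]; do
    printf 'x' > "$dir/f$i"      # 200 one-byte files
    i=$(( i + 1 ))
done

echo "du (allocated, KiB): $(du -sk "$dir" | awk '{ print $1 }')"
echo "logical bytes:       $(logical_total "$dir")"
echo "tar bytes:           $(tar_total "$dir")"
rm -r "$dir"
```

The logical total should be the smallest figure. The tar stream is larger because of its headers and padding, and on a filesystem that allocates whole 4 KB blocks the du figure would be the largest of the three.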

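Sparse files

Tar files can store sparse files. If the tarball was created with --sparse, the holes in sparse files are recorded, so the tarball can be smaller than the logical size of the files.

If the sparse information was somehow lost on the extracted copy (for example, if you extracted the tarball onto a filesystem that does not support sparse files, or if the file was compressed and then decompressed), df will report the expanded size.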
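
A quick way to try this out, as a sketch only: create a file that is one big hole and archive it with and without --sparse. The helper names are made up, the dd trick of seeking past the end without writing anything is just one common way to make a hole, and whether --sparse is accepted depends on your tar implementation.

```shell
#!/bin/sh
# Hypothetical helper: create a file of $2 KiB at path $1 that is one big
# hole, by seeking past the end and writing nothing.
make_hole_file() {
    dd if=/dev/zero of="$1" bs=1024 count=0 seek="$2" 2>/dev/null
}

# Logical size of file $1 in bytes.
byte_size() {
    echo $(( $(wc -c < "$1") ))
}

dir=$(mktemp -d)
make_hole_file "$dir/holes" 10240          # 10 MiB logical size

echo "logical bytes:    $(byte_size "$dir/holes")"
echo "allocated KiB:    $(du -k "$dir/holes" | awk '{ print $1 }')"

tar cf "$dir/plain.tar" -C "$dir" holes
echo "plain tar bytes:  $(byte_size "$dir/plain.tar")"

# --sparse is the option named above; not every tar accepts it.
if tar --sparse -cf "$dir/sparse.tar" -C "$dir" holes 2>/dev/null; then
    echo "sparse tar bytes: $(byte_size "$dir/sparse.tar")"
else
    echo "this tar does not accept --sparse"
fi
rm -r "$dir"
```

With GNU tar on a filesystem that supports holes, I would expect the plain archive to be a little over 10 MiB and the sparse one to be only a few KiB. Other implementations, bsdtar for instance, may detect holes by default, so the two archives can come out the same size there.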


银河中√捞星星 (2024-07-20 04:00:50)

du counts disk blocks, not file sizes.

About the author

东北女汉子
