BTRFS snapshots are too large. Do they actually contain only the diffs?

Posted 2025-02-13 07:31:38, 1928 characters, 1 view, 0 comments

I don't have a good understanding of how CoW snapshots work, but I expect them to contain only the diffs, with the unchanged data shared among all snapshots that have the same parent subvolume.

I made a script to check btrfs snapshots disk space consumption.

#!/usr/bin/zsh

# Append one line to the file, then snapshot the subvolume; repeat 2000 times.
for i in {1..2000}
do
    echo "line$i" >> /btrfs/test-volume/btrfs-doc.txt
    /usr/bin/time -f "execution time: %E" btrfs subvolume snapshot /btrfs/test-volume "/btrfs/snapshots/test-volume-snap$i"
done

After running it, I checked the directory sizes. Here is what I got:

❯ btrfs filesystem df /btrfs
Data, single: total=8.00MiB, used=6.84MiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=102.38MiB, used=33.39MiB
GlobalReserve, single: total=3.25MiB, used=0.00B

❯ btrfs filesystem du -s /btrfs
     Total   Exclusive  Set shared  Filename
  18.54MiB     6.74MiB    36.00KiB  /btrfs

❯ df -h /btrfs
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/vgstoragebox-btrfs  2.0G   77M  1.8G   5% /btrfs

❯ du -sh /btrfs
20M     /btrfs

❯ ll /btrfs/test-volume/btrfs-doc.txt
-rw-r--r-- 1 root root 17K Jul  6 14:50 /btrfs/test-volume/btrfs-doc.txt

❯ tree -hU /btrfs/snapshots
/btrfs/snapshots
├── [  26]  test-volume-snap1
│   └── [   6]  btrfs-doc.txt
├── [  26]  test-volume-snap2
│   └── [  12]  btrfs-doc.txt
├── [  26]  test-volume-snap3
│   └── [  18]  btrfs-doc.txt
...
├── [  26]  test-volume-snap1998
│   └── [ 16K]  btrfs-doc.txt
├── [  26]  test-volume-snap1999
│   └── [ 16K]  btrfs-doc.txt
└── [  26]  test-volume-snap2000
    └── [ 16K]  btrfs-doc.txt

2000 directories, 2000 files

Every tool reports a different size, so I can't tell how much disk space the /btrfs/snapshots directory actually consumes, but it is clearly much more than double the size of the file /btrfs/test-volume/btrfs-doc.txt. I expected roughly double the file size if btrfs snapshots store only the diffs and link to the shared data.

For comparison, I ran the same test with LVM snapshots, and they consumed very little disk space.

Comments (1)

撩心不撩汉 2025-02-20 07:31:38

From a userland perspective btrfs snapshots are just simple directories containing the files and contents of the subvolume at the time the snapshot was created. You can access them normally like any other directory.

Therefore the userland tools you used will report the sizes of the individual files within the snapshot just as with any other file.
If you create, say, 10 snapshots of the same subvolume, userland tools such as du will report the same total size for each snapshot, and adding these up across all 10 snapshots gives a disk usage of 10 times the size of the initial subvolume.

But: due to the CoW nature of these subvolumes, the files contained in the snapshots actually all share the same data blocks on disk. So although du reports 10 times the total size, the data is stored on disk only once.
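To make that gap concrete for the test in the question, here is a rough sketch that adds up the file sizes a per-file tool would see across all 2000 snapshots. It assumes each appended line is `line<i>` plus a newline (which matches the 6, 12, 18 byte sizes in the `tree` output) and ignores metadata and block rounding; the variable names are made up for illustration.

```shell
#!/bin/sh
# Model of the question's loop: line i is "line<i>\n",
# i.e. 5 bytes plus the number of digits in i.
file_size=0       # size of btrfs-doc.txt after i appends
apparent_total=0  # sum of that size over every snapshot taken so far

i=1
while [ "$i" -le 2000 ]; do
    file_size=$((file_size + 5 + ${#i}))
    apparent_total=$((apparent_total + file_size))
    i=$((i + 1))
done

echo "final file size:          $file_size bytes"
echo "summed over 2000 snapshots: $apparent_total bytes"
```

Under these assumptions the summed figure is roughly 15.5 MiB for a file of about 16.5 KiB, which is the same order of magnitude as the 18.54MiB "Total" column in the question. That number describes how much data is reachable through the snapshots, not how much is stored.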


The way Copy-on-Write works is that a new copy of a file (e.g. one created with cp --reflink) or a new snapshot is at first nothing more than a new pointer to the same physical data on disk as the original file/subvolume. So the new file will not use any additional disk space (besides some additional metadata).
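A small sketch of such a reflink copy, assuming GNU coreutils `cp`. With `--reflink=auto` the copy shares extents on filesystems that support it (such as btrfs) and silently falls back to a normal copy elsewhere, so the script runs anywhere but only saves space on a CoW filesystem. The temporary directory and file names are placeholders.

```shell
#!/bin/sh
# Create a 1 MiB test file in a scratch directory (placeholder names).
workdir=$(mktemp -d)
head -c 1048576 /dev/zero > "$workdir/original.bin"

# Reflink copy: on btrfs the clone points at the same extents as the original.
cp --reflink=auto "$workdir/original.bin" "$workdir/clone.bin"

# Both files look like independent 1 MiB files to ordinary tools.
orig_sum=$(cksum < "$workdir/original.bin")
clone_sum=$(cksum < "$workdir/clone.bin")
clone_bytes=$(wc -c < "$workdir/clone.bin")
echo "original: $orig_sum"
echo "clone:    $clone_sum"

# On btrfs, comparing `du -sh "$workdir"` with
# `btrfs filesystem du -s "$workdir"` at this point shows the difference
# between apparent and actually shared usage.

rm -rf "$workdir"
```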

Only when the data is changed is the new data written to a new place on the disk, and the pointer of the file/snapshot is updated to include that data block. All unchanged parts of the data are still shared with the original copy.

This is why creating snapshots is very fast and uses next to no additional disk space. But over time the disk space used by a snapshot may grow, since the data blocks it references diverge from those of the original subvolume and fewer and fewer data blocks are actually shared.


If you want to see the amount of data that is shared between/unique to the individual subvolumes you can use the quota support feature of btrfs.
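As a rough sketch of what that could look like: the commands below enable quotas, wait for the accounting rescan, and list the referenced and exclusive sizes per qgroup. The `/btrfs` default is taken from the question, and `show_snapshot_usage` is a made-up helper name. The commands need root, and enabling quotas adds accounting overhead, so treat this as something to try on a test filesystem first.

```shell
#!/bin/sh
# Mount point from the question; pass another path as the first argument.
MOUNTPOINT="${1:-/btrfs}"

# Hypothetical helper: enable quota accounting and print per-subvolume usage.
show_snapshot_usage() {
    mnt="$1"
    btrfs quota enable "$mnt" || return 1
    # Wait until the accounting rescan has finished before reading numbers.
    btrfs quota rescan -w "$mnt"
    # Lists each qgroup with its referenced and exclusive byte counts.
    btrfs qgroup show "$mnt"
}

# Only run against a real btrfs mount with btrfs-progs installed.
if command -v btrfs >/dev/null 2>&1 &&
   [ "$(stat -f -c %T "$MOUNTPOINT" 2>/dev/null)" = "btrfs" ]; then
    show_snapshot_usage "$MOUNTPOINT"
else
    echo "skipped: $MOUNTPOINT is not a btrfs mount or btrfs-progs is missing"
fi
```

The exclusive figure for a snapshot is the space that would be freed by deleting only that snapshot, which is the number the question is really after.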
