RSync a single (archive) file that changes every time

Posted on 2024-10-20 13:00:31

I am working on an open source backup utility that backs up files and transfers them to various external locations such as Amazon S3, Rackspace Cloud Files, Dropbox, and remote servers (via FTP, SFTP, or SCP).

Now, I have received a feature request for incremental backups (for cases where the backups are large and become expensive to transfer and store). I have been looking around and someone mentioned the rsync utility. I ran some tests with it but am unsure whether it is suitable, so I'd like to hear from anyone who has some experience with rsync.

Let me give you a quick rundown of what happens when a backup is made. It starts by dumping databases such as MySQL, PostgreSQL, MongoDB, and Redis. It might also take a few regular files (like images) from the file system. Once everything is in place, it bundles it all into a single .tar, then compresses and encrypts it using gzip and openssl.

Once that's all done, we have a single file that looks like this:
mybackup.tar.gz.enc
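
For anyone who wants to reproduce the packaging step described above, here is a minimal Python sketch. It is not the utility's actual code: the function names make_tar_gz and openssl_encrypt_command, the AES-256-CBC cipher, and the password-file approach are all assumptions picked for illustration, since the post only says that gzip and openssl are used. The sketch builds the openssl command without running it.

```python
import tarfile
import tempfile
from pathlib import Path
from typing import List


def make_tar_gz(source_dir: Path, archive_path: Path) -> Path:
    """Bundle source_dir into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_dir, arcname=source_dir.name)
    return archive_path


def openssl_encrypt_command(archive_path: Path, password_file: Path) -> List[str]:
    """Build (but do not run) an openssl command that encrypts the archive.

    Assumption: AES-256-CBC with the password read from a file.
    """
    encrypted_path = archive_path.with_name(archive_path.name + ".enc")
    return [
        "openssl", "enc", "-aes-256-cbc", "-salt",
        "-in", str(archive_path),
        "-out", str(encrypted_path),
        "-pass", f"file:{password_file}",
    ]


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the staging directory that holds dumps and regular files.
    staging = Path(tmp) / "mybackup"
    staging.mkdir()
    (staging / "mysql.sql").write_text("-- placeholder database dump\n")

    archive = make_tar_gz(staging, Path(tmp) / "mybackup.tar.gz")
    command = openssl_encrypt_command(archive, Path(tmp) / "password.txt")
    print(" ".join(command))
```

Passing the returned list to subprocess.run(command, check=True) would perform the encryption, provided the openssl command-line tool is installed and the password file exists.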

Now I want to transfer this file to a remote location. The goal is to reduce bandwidth and storage costs. Let's assume this backup package is about 1GB in size. We use rsync to transfer it to a remote location and then remove the local backup file. Tomorrow a new backup file is generated, and it turns out that a lot more data has been added in the past 24 hours, so the new mybackup.tar.gz.enc file comes to about 1.2GB.

Now, my question is: Is it possible to transfer just the 200MB that got added in the past 24 hours? I tried the following command:

rsync -vhP --append mybackup.tar.gz.enc backups/mybackup.tar.gz.enc

The result:

mybackup.tar.gz.enc
       1.20G 100%   36.69MB/s    0:00:46 (xfer#1, to-check=0/1)

sent 200.01M bytes  received 849.40K bytes  8.14M bytes/sec
total size is 1.20G  speedup is 2.01
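
To put the figures in this output side by side, here is a rough Python calculation. It assumes the -h suffixes are decimal (K = 1,000, M = 1,000,000, G = 1,000,000,000), which may not hold for every rsync version, and the parse_size helper is made up for this example.

```python
# Assumed decimal units for rsync's human-readable output.
UNITS = {"K": 10**3, "M": 10**6, "G": 10**9}


def parse_size(text: str) -> float:
    """Convert a human-readable figure such as '200.01M' into bytes."""
    suffix = text[-1].upper()
    if suffix in UNITS:
        return float(text[:-1]) * UNITS[suffix]
    return float(text)


# Figures copied from the rsync output above.
sent = parse_size("200.01M")
received = parse_size("849.40K")
total_size = parse_size("1.20G")

# Everything rsync reports as exchanged between the two sides.
exchanged = sent + received
print(f"bytes exchanged: {exchanged / 10**6:.2f}M")
print(f"share of total size: {exchanged / total_size:.1%}")
```

On these numbers alone, the bytes reported as sent and received add up to roughly one sixth of the 1.20G total size. This is only arithmetic on the reported counters, not a measurement of what actually crossed a network link.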

Looking at the "sent 200.01M bytes" line, I'd say the "appending" of the data worked properly. What I'm wondering now is whether it transferred the whole 1.2GB in order to figure out how much and what to append to the existing backup, or whether it really only transferred the 200MB. If it transferred the whole 1.2GB, then I don't see how it's much different from using the scp utility on single large files.
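
One way to reason about this is a toy model of append-style copying. The Python sketch below is not rsync's implementation. The append_copy helper is hypothetical: it copies only the bytes past the destination's current length and never reads or compares the existing bytes, which mirrors how the rsync man page describes --append (it presumes the data already on the receiving side is identical to the start of the file on the sending side). Small random byte strings stand in for the 1GB and 1.2GB files.

```python
import os
import tempfile
from pathlib import Path


def append_copy(source: Path, destination: Path) -> int:
    """Append the tail of source onto destination; return bytes copied.

    Toy model: the existing destination bytes are never read or compared.
    """
    existing = destination.stat().st_size if destination.exists() else 0
    copied = 0
    with open(source, "rb") as src, open(destination, "ab") as dst:
        src.seek(existing)  # skip what the destination is presumed to hold
        while chunk := src.read(64 * 1024):
            dst.write(chunk)
            copied += len(chunk)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    old = os.urandom(1000)      # stands in for yesterday's 1GB file
    new_tail = os.urandom(200)  # stands in for the extra 200MB

    # Case 1: today's file is yesterday's file plus new bytes at the end.
    src1, dst1 = Path(tmp) / "src1", Path(tmp) / "dst1"
    dst1.write_bytes(old)
    src1.write_bytes(old + new_tail)
    copied_prefix_case = append_copy(src1, dst1)
    prefix_case_matches = dst1.read_bytes() == src1.read_bytes()

    # Case 2: today's file is just as long, but its earlier bytes changed too.
    src2, dst2 = Path(tmp) / "src2", Path(tmp) / "dst2"
    dst2.write_bytes(old)
    src2.write_bytes(os.urandom(1200))
    copied_changed_case = append_copy(src2, dst2)
    changed_case_matches = dst2.read_bytes() == src2.read_bytes()

print(copied_prefix_case, prefix_case_matches)
print(copied_changed_case, changed_case_matches)
```

In both cases only the 200 trailing bytes are copied, but the destination ends up identical to the source only in case 1. Which case a freshly compressed and encrypted archive falls into is a separate question that this sketch does not settle.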

Also, if what I'm trying to accomplish is at all possible, what flags do you recommend? If it's not possible with rsync, is there any utility you can recommend to use instead?

Any feedback is much appreciated!
