Is there a reverse-incremental backup solution with built-in redundancy (e.g. par2)?
I'm setting up a home server primarily for backup use. I have about 90GB of personal data that must be backed up as reliably as possible while still conserving disk space. I want full file history so I can go back to any file at any particular date.
Full weekly backups are not an option because of the size of the data, so I'm looking at incremental backup solutions instead. However, I'm aware that a single corruption in a chain of incremental backups makes the entire series (beyond that point) unrecoverable. Thus simple incremental backups are not an option.
I've researched a number of solutions to the problem. First, I would use reverse-incremental backups so that the latest version of the files has the least chance of loss (older files are not as important). Second, I want to protect both the increments and the base backup with some sort of redundancy. Par2 parity data seems perfect for the job (a minimal sketch of par2 usage follows the requirements list below). In short, I'm looking for a backup solution with the following requirements:
- Reverse incremental (to save on disk space and prioritize the most recent backup)
- File history (kind of a broader category including reverse incremental)
- Par2 parity data on increments and backup data
- Preserve metadata
- Bandwidth-efficient (no copying of the entire directory for each increment; most incremental backup solutions already work this way)
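Since par2 is central to the plan, here is a minimal sketch of the create/verify/repair cycle it would rely on, assuming the par2cmdline tool is installed (the file name and redundancy level are placeholder examples):

```python
import subprocess

# Placeholder file name; assumes the par2cmdline tool is installed.
DATA_FILE = "increment.2011-05-01.diff.gz"

# Create parity files with 10% redundancy: roughly 10% of the
# data can be damaged and still be repaired.
subprocess.run(["par2", "create", "-r10", DATA_FILE + ".par2", DATA_FILE],
               check=True)

# Later: verify, and repair from the parity data if verification fails.
if subprocess.run(["par2", "verify", DATA_FILE + ".par2"]).returncode != 0:
    subprocess.run(["par2", "repair", DATA_FILE + ".par2"], check=True)
```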
This would (I believe) ensure file integrity and relatively small backup sizes. I've already looked at a number of backup solutions, but each has problems:
- Bacula - Simple normal incremental backups
- bup - incremental and implements par2 but isn't reverse incremental and doesn't preserve metadata
- duplicity - incremental, compressed, and encrypted but isn't reverse incremental
- dar - incremental, and par2 is easy to add, but it isn't reverse incremental, and no file history?
- rdiff-backup - almost perfect for what I need but it doesn't have par2 support
So far rdiff-backup seems like the best compromise, but it doesn't support par2. I think I can add par2 support for the backup increments easily enough, since they aren't modified after each backup, but what about the rest of the files? I could generate par2 files recursively for every file in the backup, but that would be slow and inefficient, and I'd have to worry about corruption during a backup and about stale par2 files. In particular, I couldn't tell the difference between a changed file and a corrupt one, and I don't know how to check for such errors or how they would affect the backup history. Does anyone know of a better solution? Is there a better approach to the issue?
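To illustrate, the increments half could look something like this rough sketch (Python driving the par2 CLI; the paths and redundancy level are placeholder assumptions). It only covers the write-once increments, not the mirror files that rdiff-backup rewrites on every run, which is exactly the unsolved part:

```python
import os
import subprocess

# Placeholder destination; rdiff-backup keeps increments under
# <destination>/rdiff-backup-data/increments.
INCREMENTS_DIR = "/backup/rdiff-backup-data/increments"

for root, _dirs, files in os.walk(INCREMENTS_DIR):
    for name in files:
        if name.endswith(".par2"):
            continue  # skip parity files themselves
        path = os.path.join(root, name)
        if os.path.exists(path + ".par2"):
            continue  # increments are write-once, so old parity stays valid
        # 5% redundancy is an arbitrary example value.
        subprocess.run(["par2", "create", "-r5", path + ".par2", path],
                       check=True)
```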
Thanks for reading through my difficulties and for any input you can give me. Any help would be greatly appreciated.
Comments (2)
http://www.timedicer.co.uk/index
Uses rdiff-backup as the engine. I've been looking at it, but it requires me to set up a "server" using Linux or a virtual machine.
Personally, I use WinRAR to make pseudo-incremental backups (it actually makes a full backup of recent files), run daily by a scheduled task. It is similarly a "push" backup.
It's not a true incremental (or reverse-incremental) backup, but it saves different versions of files based on when they were last updated. I mean, it saves the version for today, yesterday, and the previous days, even if the file is identical. You can use the archive bit to save space, but I don't bother anymore, as all I back up are small spreadsheets and documents.
RAR has its own parity, or recovery record, whose size you can set as an amount or a percentage. I use 1% (one percent).
It can preserve metadata; I personally skip the high-resolution timestamps.
It can be efficient since it compresses the files.
Then all I have to do is send the archive to my backups. I have it copied to a different drive and to another computer on the network. No need for a true server, just a share. You can't do this for too many computers, though, as Windows workstations have a 10-connection limit.
So my setup, which may fit your purposes, backs up daily any files that have been updated in the last 7 days. Then another scheduled backup, run once a month (every 30 days), catches files updated in the last 90 days.
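Concretely, the daily task runs something along these lines (wrapped in Python here for illustration; the paths are placeholders, and the switches are from the RAR manual, so double-check them against your version):

```python
import subprocess
from datetime import date

# Placeholder paths; assumes WinRAR's command-line rar.exe is on PATH.
SOURCE = r"C:\Users\me\Documents"
ARCHIVE = rf"D:\Backup\docs-{date.today():%Y-%m-%d}.rar"

subprocess.run([
    "rar", "a",   # add files to a new archive
    "-r",         # recurse into subdirectories
    "-rr1p",      # 1% recovery record (RAR's built-in parity data)
    "-tn7d",      # only files modified within the last 7 days
    ARCHIVE, SOURCE,
], check=True)
```

Each run produces a complete archive of the recent files, which is what makes it pseudo-incremental rather than truly incremental.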
But I use Windows, so if you're actually setting up a Linux server, you might check out the Time Dicer.
Since nobody was able to answer my question, I'll write up a few possible solutions I found while researching the topic. In short, I believe the best solution is rdiff-backup to a ZFS filesystem. Here's why: rdiff-backup supplies the reverse increments and full file history, while ZFS checksums every block it stores, so silent corruption is detected at the filesystem level and, given some redundancy (a mirror, raidz, or copies=2), repaired automatically. That fills the role I had wanted par2 for.
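The combination itself is short. A sketch, with placeholder pool and path names, and assuming the classic two-argument rdiff-backup syntax:

```python
import subprocess

# Placeholder names: a ZFS pool "tank" with a dataset mounted at /backup.
SOURCE = "/home/user"
DEST = "/backup/home"
POOL = "tank"

# Reverse-incremental backup with full file history.
subprocess.run(["rdiff-backup", SOURCE, DEST], check=True)

# Re-read every block in the pool and verify its checksum; with
# redundancy (mirror, raidz, or copies=2) ZFS repairs bad blocks itself.
# Both commands typically need root.
subprocess.run(["zpool", "scrub", POOL], check=True)
```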
Personally, I am not using this solution, as ZFS is a little tricky to get working on Linux. Btrfs looks promising, but its stability hasn't yet been proven by years of use. Instead, I'm going with the cheaper option of simply checking hard drive SMART data. Hard drives should do some error checking/correcting themselves, and by monitoring this data I can see whether that process is working properly. It's not as good as additional filesystem parity, but better than nothing.
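A sketch of that monitoring, assuming smartmontools is installed (the device name is a placeholder, and smartctl generally needs root):

```python
import subprocess

DEVICE = "/dev/sda"  # placeholder device name

# Run the overall SMART health self-assessment.
result = subprocess.run(["smartctl", "-H", DEVICE],
                        capture_output=True, text=True)
print(result.stdout)

# smartctl's exit status is a bit mask; any non-zero bit signals
# a problem worth investigating before trusting the backups.
if result.returncode != 0:
    print(f"WARNING: {DEVICE} reports SMART trouble")
```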
A few more notes that might be interesting to people looking into reliable backup development: