Record filesystem writes as binary diffs for a remote application?
I'm working with continuously updated ~20 GB Interbase backups that I want to replicate over the internet. How can I minimise the data transferred?
I was considering a binary diff tool, but I understand bsdiff requires at least O(7n) memory, and these Interbase backups only change incrementally over the LAN via Interbase's proprietary gbak tool anyway. Is there any way I can hook into the Linux filesystem (ext/btrfs/...) to capture all changes made to this file, export them in a common diff format, and reassemble the file on a different (Windows) platform?
2 Answers
How about InterBase's incremental backup feature? You can do an incremental backup (of the log files) to a temporary dump location and then ship only that incremental data to the offsite location. You will still need to keep the initial full backup at the destination in order to apply the incremental backups on top of it.
That way only a very small amount of data needs to be transferred each time.
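A minimal sketch of how this could be orchestrated, assuming InterBase's online/incremental dump is exposed through gbak (shown here as the -d switch, which may differ between InterBase versions; check the documentation) and that rsync is available on both ends. The database path, dump path, credentials and remote host are all placeholders:

```python
#!/usr/bin/env python3
"""Sketch: refresh an InterBase incremental dump locally, then ship it offsite.

Assumptions (not verified against your setup):
  * `gbak -d` performs an online/incremental dump on your InterBase version;
    consult the InterBase documentation for the exact switch.
  * rsync is installed and the offsite host already holds the initial full dump.
  * Paths, user and password below are placeholders.
"""
import subprocess

DB = "employee.ib"                                  # source database (placeholder)
DUMP = "/backups/employee.dmp"                      # local incremental dump target
OFFSITE = "backup@offsite.example.com:/backups/"    # remote destination (placeholder)


def refresh_incremental_dump() -> None:
    # Re-running the online dump against an existing dump file should only
    # write the pages changed since the last run (InterBase-specific behaviour).
    subprocess.run(
        ["gbak", "-d", "-user", "SYSDBA", "-password", "masterkey", DB, DUMP],
        check=True,
    )


def ship_offsite() -> None:
    # rsync's delta-transfer algorithm sends only the changed blocks of the
    # dump file; --inplace avoids rewriting the whole remote file.
    subprocess.run(
        ["rsync", "--archive", "--compress", "--partial", "--inplace", DUMP, OFFSITE],
        check=True,
    )


if __name__ == "__main__":
    refresh_incremental_dump()
    ship_offsite()
```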
You might be able to use rsync. If the changes in the database happen to be appended to the end of the backup file, it will work perfectly. However, if the backup file is heavily rewritten (many small chunks/rows inserted, deleted or modified at random positions), rsync will not do the job. It depends on how often you synchronise relative to how often rows are inserted/deleted in your database.
Tools such as xdelta might help in that case: they use a windowed approach to the delta computation and may be able to find common pieces much smaller than rsync can, and thus preserve the common parts even with a higher density of changes. You will need both an 'old' and the latest backup to use them.
The good news is that the backup will probably be organised the same way each time it is produced (same table/row order), which helps both algorithms.
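As an illustration of the xdelta approach, here is a minimal sketch wrapping the xdelta3 command-line tool (available for both Linux and Windows). The file names are placeholders, and the receiving Windows host must already hold a copy of the previous backup:

```python
#!/usr/bin/env python3
"""Sketch: ship only an xdelta3 delta between two gbak backups.

Assumptions: xdelta3 is installed on both hosts, the file names are
placeholders, and the Windows side already has the previous backup (old.gbk).
"""
import subprocess


def make_delta(old: str, new: str, delta: str) -> None:
    # Linux side: encode the difference between the previous and the new backup.
    # -e = encode, -s = source (the old file the delta is computed against).
    subprocess.run(["xdelta3", "-e", "-s", old, new, delta], check=True)


def apply_delta(old: str, delta: str, new: str) -> None:
    # Windows side: reconstruct the new backup from the old copy plus the delta.
    # -d = decode, -s = source.
    subprocess.run(["xdelta3", "-d", "-s", old, delta, new], check=True)


if __name__ == "__main__":
    # On the Linux host, after gbak has produced new.gbk:
    make_delta("old.gbk", "new.gbk", "backup.xd3")
    # Transfer backup.xd3 (much smaller than the full 20 GB file) to the
    # Windows host, then run there:
    # apply_delta("old.gbk", "backup.xd3", "new.gbk")
```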