Incremental backup: how to track file deletions

Posted on 2024-10-26 19:05:11

I have an offsite backup solution written in C++ that breaks files into blocks and keeps track of the blocks using MD5 hashes in a SQLite3 database. It transfers the blocks, along with the database, to a remote site.

When I want to do a restore, it queries the SQLite3 database and restores the blocks accordingly.

When the first backup runs, it creates a big table called base_backup. Every subsequent file change or new file is added as a new record in a new table. To do a restore, I query the base_backup table plus all the differences and restore the files.

The way the backup runs, it scans all the files in a given folder for the archive bit, and if it is cleared, it then checks that a record does not already exist in the database and decides whether to back the file up or not.

Coming to my question: if a file is deleted on the local computer, how do I keep track of it and update the offsite backup accordingly? When I do a restore, I don't want to restore all the garbage files. Is there any way of knowing whether files have been deleted from a folder? I do not want to run a verify check against the database, since it will take too long.

Comments (3)

獨角戲 2024-11-02 19:05:11

inotify with IN_DELETE?
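
To make this suggestion concrete, here is a minimal sketch of watching one directory for deletions with inotify. It assumes a Linux host (inotify is Linux-specific, whereas the archive bit mentioned in the question is a Windows concept), and the helper names watch_deletions and drain_deleted are made up for this example. Adding IN_MOVED_FROM is also an assumption on my part, so that files moved out of the folder count as gone too. Note that a watch is not recursive and only reports deletions that happen while the watcher is running.

```cpp
#include <sys/inotify.h>
#include <unistd.h>

#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: starts watching `dir` for entries that are deleted
// or moved out of it. Returns a non-blocking inotify descriptor, or -1.
int watch_deletions(const std::string& dir) {
    int fd = inotify_init1(IN_NONBLOCK);
    if (fd < 0) return -1;
    if (inotify_add_watch(fd, dir.c_str(), IN_DELETE | IN_MOVED_FROM) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Hypothetical helper: reads every event currently queued on `fd` and
// returns the names (relative to the watched directory) that disappeared.
std::vector<std::string> drain_deleted(int fd) {
    assert(fd >= 0);
    std::vector<std::string> deleted;
    alignas(inotify_event) char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0) break;  // nothing more queued (EAGAIN) or a read error
        for (char* p = buf; p < buf + n;) {
            auto* ev = reinterpret_cast<inotify_event*>(p);
            if ((ev->mask & (IN_DELETE | IN_MOVED_FROM)) && ev->len > 0)
                deleted.emplace_back(ev->name);
            p += sizeof(inotify_event) + ev->len;  // events are variable-length
        }
    }
    return deleted;
}
```

A backup run could call drain_deleted just before scanning and mark the returned names as deleted in the database. Anything removed while the watcher was not running would still be missed, so this complements a periodic check rather than replacing it.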

幸福还没到 2024-11-02 19:05:11

Create a service to monitor the directory (use FindFirstChangeNotification or ReadDirectoryChangesW).

心欲静而疯不止 2024-11-02 19:05:11

You could add a new piece of information to your database that lists which files existed during each backup. Then, even if a file has not changed, a new (small) entry is made during the backup, indicating that it still exists.

When restoring a backup from a given date in the past, select only the files that have entries showing they existed during the most recent backup on or before that date.

For example, a pair of tables like this might work:

Path (text)      BackupIndex (int)
path/to/file1    1
path/to/file2    1
path/to/file1    2

Notice that path/to/file2 does not appear in backup #2, as it was not in the directory during the backup (it must have been deleted).

BackupIndex (int)    Timestamp (timestamp)
1                    2011-03-12 07:42:31 UTC
2                    2011-03-20 08:21:56 UTC

If somebody wants to restore the files as they existed on March 15th, you look at the table of backup indices, see that backup #1 was the most recent one at that date, and look up all paths that existed in backup #1 from the paths table.

So basically, you are pushing off deciding whether a file was deleted onto the restore operation, rather than the backup operation.
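
The restore-time lookup described above can be sketched in plain C++ as follows. This is only an illustration under some assumptions: the two tables are modelled as in-memory vectors rather than SQLite3 tables, the struct and function names (Backup, PathEntry, latest_backup_at, paths_to_restore) are invented for the example, and timestamps are stored as fixed-width, zero-padded UTC strings so that string comparison matches chronological order.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One row of the backup-index table; the timestamp is a zero-padded
// "YYYY-MM-DD hh:mm:ss" UTC string (an assumption of this sketch).
struct Backup {
    int index;
    std::string timestamp;
};

// One row of the path table: `path` existed when backup `backup_index` ran.
struct PathEntry {
    std::string path;
    int backup_index;
};

// Returns the index of the latest backup taken at or before `when`,
// or -1 if no backup is that old.
int latest_backup_at(const std::vector<Backup>& backups, const std::string& when) {
    int best = -1;
    std::string best_ts;
    for (const auto& b : backups) {
        // Fixed-width strings are what makes the comparisons below valid.
        assert(b.timestamp.size() == when.size());
        if (b.timestamp <= when && (best == -1 || b.timestamp > best_ts)) {
            best = b.index;
            best_ts = b.timestamp;
        }
    }
    return best;
}

// Returns the paths that existed in the backup chosen for `when`.
// Files deleted before that backup have no entry for it and are skipped.
std::set<std::string> paths_to_restore(const std::vector<Backup>& backups,
                                       const std::vector<PathEntry>& entries,
                                       const std::string& when) {
    std::set<std::string> result;
    const int idx = latest_backup_at(backups, when);
    if (idx == -1) return result;
    for (const auto& e : entries) {
        if (e.backup_index == idx) result.insert(e.path);
    }
    return result;
}
```

With the example rows above, asking for "2011-03-15 00:00:00" selects backup #1 and yields both path/to/file1 and path/to/file2, while any date after backup #2 yields only path/to/file1. In a real database the same selection would be a query on the two tables; the point of the design is that the backup never has to detect a deletion, it only has to record what it saw.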
