Incremental backup: how to track file deletions

Posted on 2024-10-26 19:05:11

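I have an offsite backup solution written in C++ that splits files into chunks and tracks the chunks by their MD5 hashes in a SQLite3 database. It transfers the chunks to the remote site along with the database.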

When I want to do a restore, it queries the SQLite3 database and restores the chunks accordingly.

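The first backup run creates a large table called base_backup. Every subsequent file change or new file is added as a new record in a new table. To do a restore, I query the base_backup table together with all of the differences and restore the files.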

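The backup works by scanning all the files in the given folder for the archive bit; if the bit is cleared, it checks the database to verify that a record does not already exist and decides whether to back the file up.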

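Now to my question: if a file is deleted on the local machine, how do I track that and update the offsite backup accordingly? When I do a restore, I don't want to bring back all the junk files. Is there a way to know that a file has been deleted from the folder? I don't want to run a verification check against the database because that would take too long.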

獨角戲 2024-11-02 19:05:11

inotify with IN_DELETE?
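
For readers on Linux, here is a minimal sketch of what that could look like. The helper names `watchDeletions` and `readDeletedNames` are made up for illustration; only the `inotify_*` calls and flags are the real API. Note the limits: a watch only reports deletions that happen while the watcher is running, and it is not recursive, so each subdirectory would need its own watch.

```cpp
#include <sys/inotify.h>
#include <unistd.h>
#include <string>
#include <vector>

// Hypothetical helper: open a non-blocking inotify instance that watches
// `dir` for files disappearing. Returns the inotify fd, or -1 on failure.
int watchDeletions(const std::string& dir) {
    int fd = inotify_init1(IN_NONBLOCK);
    if (fd < 0) return -1;
    // IN_DELETE: a file was removed from the watched directory.
    // IN_MOVED_FROM: a file was renamed or moved out of it.
    if (inotify_add_watch(fd, dir.c_str(), IN_DELETE | IN_MOVED_FROM) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Hypothetical helper: drain all pending events and return the names of
// the files that were deleted or moved away since the last call.
std::vector<std::string> readDeletedNames(int fd) {
    std::vector<std::string> names;
    alignas(struct inotify_event) char buf[4096];
    ssize_t len;
    // With IN_NONBLOCK, read() returns -1 (EAGAIN) once the queue is empty.
    while ((len = read(fd, buf, sizeof buf)) > 0) {
        for (char* p = buf; p < buf + len;) {
            const struct inotify_event* ev =
                reinterpret_cast<const struct inotify_event*>(p);
            if ((ev->mask & (IN_DELETE | IN_MOVED_FROM)) && ev->len > 0) {
                names.emplace_back(ev->name);
            }
            // Events are variable-length: header plus the padded name.
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
    return names;
}
```

A backup service could call `readDeletedNames` periodically and record the returned names as deletions in its database, instead of rescanning the folder.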


幸福还没到 2024-11-02 19:05:11

Create a service that monitors the directory (using FindFirstChangeNotification or ReadDirectoryChangesW).


心欲静而疯不止 2024-11-02 19:05:11

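You could add a new piece of information to your database that records which files existed during each backup. Then, even if a file has not changed, the backup writes a new (small) entry for it, indicating that it still existed.

When restoring from a given date in the past, select only the files that have an entry for the most recent backup taken on or before that date.

For example, a pair of tables like this might work:

Path(text)       BackupIndex(int)
path/to/file1    1
path/to/file2    1
path/to/file1    2

Notice that path/to/file2 does not appear in backup #2 because it was not in the directory during that backup (it must have been deleted).

BackupIndex(int)    Timestamp(timestamp)
1                   2011-03-12 7:42:31 UTC
2                   2011-03-20 8:21:56 UTC

If somebody wants to restore the files as they existed on March 15th, you look at the table of backup indices, see that backup #1 was the most recent one at that date, and look up all paths that existed in backup #1 in the paths table.

So basically, you are deferring the decision about whether a file was deleted to the restore operation, rather than making it during the backup.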
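
The lookup described above can be sketched in C++ with an in-memory stand-in for the two tables. Everything here is an assumption made for illustration: the `BackupCatalog` type and its member names are hypothetical, and timestamps are assumed to be stored as zero-padded ISO 8601 UTC strings so that plain string comparison matches chronological order. A real implementation would run the equivalent queries against the SQLite3 database.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical in-memory model of the two tables.
struct BackupCatalog {
    std::map<int, std::string> timestampByIndex;        // BackupIndex -> Timestamp
    std::map<int, std::set<std::string>> pathsByIndex;  // BackupIndex -> Paths present

    // Record one backup run: every path the scan found, changed or not.
    void recordBackup(int index, const std::string& timestamp,
                      const std::set<std::string>& pathsSeen) {
        // Assumption: indices and timestamps both increase with each run.
        assert(timestampByIndex.empty() ||
               (index > timestampByIndex.rbegin()->first &&
                timestamp > timestampByIndex.rbegin()->second));
        timestampByIndex[index] = timestamp;
        pathsByIndex[index] = pathsSeen;
    }

    // Index of the most recent backup taken at or before `when`,
    // or -1 if no backup is that old.
    int latestBackupAtOrBefore(const std::string& when) const {
        int best = -1;
        for (const auto& entry : timestampByIndex) {  // ascending index order
            if (entry.second <= when) best = entry.first;
        }
        return best;
    }

    // Paths to restore for that point in time. Files deleted before the
    // selected backup have no entry in it, so they are simply left out.
    std::set<std::string> pathsToRestore(const std::string& when) const {
        const int index = latestBackupAtOrBefore(when);
        if (index < 0) return std::set<std::string>();
        return pathsByIndex.at(index);
    }
};
```

Loaded with the example data (backup 1 at 2011-03-12T07:42:31Z with both files, backup 2 at 2011-03-20T08:21:56Z with only path/to/file1), a restore for 2011-03-15 selects backup 1 and returns both paths, while a restore for any time after backup 2 returns only path/to/file1.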
