How to compare two volumes and list the modified files?
I have two hard-disk volumes (one is a backup image of the other). I want to compare the volumes and list all the modified files, so that the user can select the ones he/she wants to roll back.
Currently I'm recursing through the new volume and comparing each file's timestamps to the old volume's files (if they are in the old volume). Obviously this is a clumsy approach. It's time-consuming and wrong!
Is there an efficient way to do it?
EDIT:
- I'm using FindFirstFile and the like to recurse the volume and gather info on each file (not very slow, just a few minutes).
- I'm using Volume Shadow Copy to back up.
- The backup-volume is remote so I cannot continuously monitor the actual volume.
Comments (5)
Part of this depends upon how the two volumes are duplicated; if they are 'true' copies from the file system's point of view (e.g. shadow copies or other block-level copies), you can do a few tricky little things with respect to USN, which is the general technology others are suggesting you look into. You might want to look at an API like FSCTL_READ_FILE_USN_DATA, for example. That API will let you compare two different copies of a file (again, assuming they are the same file with the same file reference number from block-level backups). If you wanted to be largely stateless, this and similar APIs would help you a lot here. My algorithm would look something like this:
All of that said, my experience leads me to believe that this will be more complicated than my off-the-cuff explanation hints at. This might be a good starting place, though.
If the volumes are not block-level copies of one another, then it will be very difficult to compare USN numbers and file IDs, if not impossible. Instead, you may very well be going by file name, which will be difficult if not impossible to do without opening every file (times can be modified by apps, sizes and times can be out of date in the findfirst/next queries, and you have to handle deleted-then-recreated cases, rename cases, etc.).
So knowing how much control you have over the environment is pretty important.
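For the non-block-level case described above, where files have to be matched by name and opened to be sure, a minimal sketch might look like the following. It is written in Python for portability rather than against the Win32 API, and the function names (`file_digest`, `list_files`, `compare_trees`) are hypothetical, chosen only for illustration. It assumes both volumes are mounted and readable as ordinary directory trees.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def list_files(root):
    """Map each file's path relative to root to its full path."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            files[os.path.relpath(full, root)] = full
    return files


def compare_trees(current_root, backup_root):
    """Classify files as added, removed, or modified relative to the backup."""
    current = list_files(current_root)
    backup = list_files(backup_root)
    added = sorted(set(current) - set(backup))
    removed = sorted(set(backup) - set(current))
    modified = []
    for rel in sorted(set(current) & set(backup)):
        # A size mismatch proves a change without reading the contents.
        if os.path.getsize(current[rel]) != os.path.getsize(backup[rel]):
            modified.append(rel)
        # Same size: only the contents can tell, so both files are opened.
        elif file_digest(current[rel]) != file_digest(backup[rel]):
            modified.append(rel)
    return added, removed, modified
```

Because files are matched purely by relative path, a rename shows up as one removed and one added entry, and a file that was deleted and then recreated with identical contents is reported as unchanged. Those are exactly the cases that make the name-based route awkward.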
Instead of waiting until after changes have happened, and then scanning the whole disk to find the (usually few) files that have changed, I'd set up a program to use ReadDirectoryChangesW to monitor changes as they happen. This will let you build a list of files with a minimum of fuss and bother.
Assuming you're not comparing each file on the new volume to every file in the snapshot, that's the only way you can do it. How are you going to find which files aren't modified without looking at all of them?
I am not a Windows programmer.
However, shouldn't you have a stat function to retrieve the modified time of a file?
Sort the files based on modification time.
The files with a modification time later than your last backup time are the ones of interest.
The first time, you can iterate over the backup volume to work out the maximum modification time and creation time from your set of interest.
I am assuming the directories of interest don't get modified in the backup volume.
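A rough sketch of that idea, in Python rather than Win32 so that it stays portable, could look like this. The helper names `newest_mtime` and `modified_since` are made up for illustration, and the sketch assumes that the newest modification time on the backup volume is a fair stand-in for "last backup time".

```python
import os


def newest_mtime(root):
    """Return the largest modification time found under root (0.0 if empty)."""
    newest = 0.0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            newest = max(newest, os.stat(os.path.join(dirpath, name)).st_mtime)
    return newest


def modified_since(root, cutoff):
    """Return (mtime, relative path) pairs for files changed after cutoff, oldest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            mtime = os.stat(full).st_mtime
            if mtime > cutoff:
                hits.append((mtime, os.path.relpath(full, root)))
    return sorted(hits)
```

Typical use would be `cutoff = newest_mtime(backup_root)` followed by `modified_since(live_root, cutoff)`. As another comment points out, applications can change timestamps themselves, so the result is better treated as a list of candidates than as a definitive answer.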
Without knowing more details about what you're trying to do here, it's hard to say. However, some tips about what I think you're trying to achieve: