How to programmatically find the differences between two directories

Posted 2024-08-16 06:51:27

First off, I am not necessarily looking for Delphi code; spit it out any way you want.

I've been searching around (especially here) and found a bit about people looking for ways to compare two directories (including subdirectories), though they were using byte-by-byte methods. Second, I am not looking for a difftool. I am "just" looking for a way to find files which do not match and, just as important, files which are in one directory but not the other, and vice versa.

To be more specific: I have one directory (the backup folder) which I keep up to date using FindFirstChangeNotification. The first time, though, I need to copy all files, and I also need to check the backup directory against the original when the application starts (in case something changed while the application wasn't running, or FindFirstChangeNotification missed a file change). To solve this I am thinking of creating a CRC list for the backed-up files, then running through the original directory computing the CRC of every file, and finally comparing the two CRC lists. Then I'd somehow look for files which are in one directory and not the other (again, vice versa).

Here's the question: Is this the fastest way? If so, how would one (roughly) get the job done?

Comments (3)

淡淡的优雅 2024-08-23 06:51:27

You don't necessarily need CRCs for each file; for most normal purposes you can just compare each file's "last modified" date. It's WAY faster. If you need additional safety, you can also compare the lengths. You get both of these metrics for free with the find functions (on Win32, FindFirstFile/FindNextFile return them in WIN32_FIND_DATA).
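
A rough sketch of that comparison, in Python rather than Delphi. The names snapshot and compare_dirs are made up for illustration, and it assumes the backup copy preserves modification times (otherwise every file will look changed). Times are truncated to whole seconds, which is also an assumption; some filesystems are coarser than that.

```python
import os

def snapshot(root):
    """Map each file's path relative to root to its (size, mtime) pair."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            # Truncate to whole seconds (assumption: sub-second noise is not meaningful).
            entries[os.path.relpath(full, root)] = (info.st_size, int(info.st_mtime))
    return entries

def compare_dirs(original, backup):
    """Return (only_in_original, only_in_backup, changed) as sorted lists of relative paths."""
    a, b = snapshot(original), snapshot(backup)
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    # A file counts as changed when its size or modification time differs.
    changed = sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
    return only_a, only_b, changed
```

Files in the first list need copying, files in the second can be deleted from the backup, and files in the third need re-copying. No file contents are read at all, which is where the speed comes from.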

And in your change-notification handler, you should probably add the changed files to a queue and use a timer object to copy the queued files every ~30 seconds or so, so you don't bog down the system with frequent updates and checks.
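
Something like this, as a minimal Python sketch of the queue-plus-timer idea. ChangeBatcher and copy_func are hypothetical names; copy_func stands in for whatever actually copies one file to the backup. A set is used instead of a plain queue so a file that changes ten times in one interval is copied once.

```python
import threading

class ChangeBatcher:
    """Collect changed paths and hand them to copy_func in periodic batches."""

    def __init__(self, copy_func, interval=30.0):
        self._copy_func = copy_func   # hypothetical callback that copies one file
        self._interval = interval     # seconds between flushes
        self._pending = set()         # a set collapses repeated changes to one entry
        self._lock = threading.Lock()
        self._timer = None

    def notify(self, path):
        """Call this from the change-notification handler; it only records the path."""
        with self._lock:
            self._pending.add(path)

    def flush(self):
        """Copy everything queued so far and return the paths that were handled."""
        with self._lock:
            batch, self._pending = self._pending, set()
        for path in sorted(batch):
            self._copy_func(path)
        return sorted(batch)

    def start(self):
        """Flush now, then schedule the next flush after the interval."""
        self.flush()
        self._timer = threading.Timer(self._interval, self.start)
        self._timer.daemon = True
        self._timer.start()

    def stop(self):
        """Cancel the pending timer, if any."""
        if self._timer is not None:
            self._timer.cancel()
```

The notification handler calls notify(path) and returns immediately; the copying happens on the timer thread, outside the lock, so slow copies don't block new notifications.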

For additional speed, use the Win32 functions wherever possible and avoid any Delphi find/copy/getfileinfo wrappers. I'm not familiar with the Delphi framework, but the C# equivalents, for example, are WAY slower than the Win32 functions.

鹊巢 2024-08-23 06:51:27

Regardless of you "not looking for a difftool", are you opposed to using Cygwin with its "diff" command for the shell? If you are open to this, it's quite easy, particularly using diff with the -r "recursive" option.

The following shows the differences between 2 Rails installs on my machine. The egrep filter keeps the lines naming files that differ and, by matching 'Only', the files that exist in one directory but not the other:

$ diff -r pgnindex pgnonrails | egrep '^Only|diff'
Only in pgnindex/app/controllers: openings_controller.rb
Only in pgnindex/app/helpers: openings_helper.rb
Only in pgnindex/app/views: openings
diff -r pgnindex/config/environment.rb pgnonrails/config/environment.rb
diff -r pgnindex/config/initializers/session_store.rb pgnonrails/config/initializers/session_store.rb
diff -r pgnindex/log/development.log pgnonrails/log/development.log
Only in pgnindex/test/functional: openings_controller_test.rb
Only in pgnindex/test/unit: helpers
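
If shelling out to diff is not an option, a similar report can be sketched with Python's standard filecmp module. The function name report and the "differ:" line format are made up here. Note that filecmp.dircmp by default treats two files with the same type, size and modification time as equal without reading them, and skips a few version-control directories.

```python
import filecmp
import os

def report(left, right):
    """Return diff-style lines for two directory trees, recursing into common subdirectories."""
    lines = []

    def walk(cmp):
        for name in sorted(cmp.left_only):
            lines.append(f"Only in {cmp.left}: {name}")
        for name in sorted(cmp.right_only):
            lines.append(f"Only in {cmp.right}: {name}")
        for name in sorted(cmp.diff_files):
            lines.append(
                f"differ: {os.path.join(cmp.left, name)} {os.path.join(cmp.right, name)}"
            )
        # subdirs maps each common subdirectory name to its own dircmp object.
        for name in sorted(cmp.subdirs):
            walk(cmp.subdirs[name])

    walk(filecmp.dircmp(left, right))
    return lines
```

Calling report("pgnindex", "pgnonrails") would give one line per missing or differing entry, much like the filtered diff output above.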

还给你自由 2024-08-23 06:51:27

The fastest way to compare one directory on the local machine to a directory on another machine thousands of miles away is exactly as you propose:

  • generate a CRC/checksum for every file
  • send the name, path, and CRC/checksum for each file over the internet to the other machine
  • compare

Perhaps the easiest way to do that is to use rsync with the "--dry-run" or "--list-only" option.
(Or use one of the many applications that use the rsync algorithm,
or compile the rsync algorithm into your application).

cd some_backup_directory
# -a recurses and preserves times, -v lists what would be copied; nothing is transferred
rsync --dry-run -av myname@remote_host:latest_version_directory/ .

For speed, rsync by default assumes, as Blindy suggested, that two files with the same name, the same path, the same length and the same modification time are the same.
For extra safety, you can give rsync the "--checksum" option so that it stops trusting the modification time and instead compares a checksum of the actual contents of each file.
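
If you would rather roll the three steps from the top of this answer yourself, here is a minimal Python sketch. The function names are made up, and CRC-32 via zlib is just one possible checksum. Each machine would run build_manifest on its own directory; how the manifest travels between machines is left out.

```python
import os
import zlib

def crc32_of(path, chunk_size=1 << 20):
    """CRC-32 of a file's contents, read in chunks to bound memory use."""
    crc = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc

def build_manifest(root):
    """Map each relative path (with '/' separators) under root to its CRC-32."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = crc32_of(full)
    return manifest

def compare_manifests(local, remote):
    """Return (only_local, only_remote, mismatched) as sorted lists of paths."""
    only_local = sorted(local.keys() - remote.keys())
    only_remote = sorted(remote.keys() - local.keys())
    mismatched = sorted(
        p for p in local.keys() & remote.keys() if local[p] != remote[p]
    )
    return only_local, only_remote, mismatched
```

Unlike the size-and-time check, this reads every byte of every file, so it is the slow but thorough option, roughly what rsync's "--checksum" trades for.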
