How to compare directories to determine which files have changed?
We need a script that will compare two directories of files and, for each file that has been altered between directory 1 and directory 2 (added, deleted, or modified), create a subset containing only those changed files.
My first impression is to create a Python script that traverses each directory, computes a hash of each file, and, if the hash has changed, copies the file over to the new subset of files. Is this a proper approach? Am I neglecting any existing tools that already do this? I've never used it, but maybe something like rsync could be used?
Thanks
Edit:
The important part is that I am able to compile a subset of only those files that were changed, so if only 3 files have changed between versions, I only need those three files copied to a new directory...
Comments (5)
It seems to me that you need something as simple as this:
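One way to keep things that simple, assuming the standard-library `filecmp` module is acceptable, is sketched below. This is an illustration only and not necessarily the code this answer originally showed; the function name `changed_files` and the result keys are hypothetical.

```python
import filecmp
from pathlib import Path


def changed_files(old: str, new: str) -> dict[str, list[str]]:
    """Classify entries as added, deleted, or modified between two trees.

    Paths in the result are relative to the directory roots.
    Caveat: filecmp.dircmp compares files shallowly by default, so two
    files with the same size and modification time are treated as equal
    without reading their contents.
    """
    result: dict[str, list[str]] = {"added": [], "deleted": [], "modified": []}

    def walk(cmp: filecmp.dircmp, prefix: Path) -> None:
        # left_only / right_only can contain directories as well as files.
        result["deleted"] += [str(prefix / name) for name in cmp.left_only]
        result["added"] += [str(prefix / name) for name in cmp.right_only]
        result["modified"] += [str(prefix / name) for name in cmp.diff_files]
        # Recurse into subdirectories that exist on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(old, new), Path("."))
    return result
```

Calling `changed_files("dir1", "dir2")` would return the three lists; the entries under `"added"` and `"modified"` are the ones to copy from the second directory into the subset.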
That is one completely reasonable approach, but you are essentially reinventing rsync. So yes, use rsync.
Edit: There's a way to create "difference-only" folders using rsync.
I like diffmerge; it works great for this purpose.
I have modified @eyquem's answer a bit!
Arguments can be given as
NOTE: it sorts on the basis of modification time!
Including subfolders and comparing the hashes of the files (>Python 3.11 required)