Comparing folders and content with PowerShell
I have two different folders with xml files. One folder (folder2) contains updated and new xml files compared to the other (folder1). I need to know which files in folder2 are new/updated compared to folder1 and copy them to a third folder (folder3). What's the best way to accomplish this in PowerShell?
Answers (7)
OK, I'm not going to code the whole thing for you (what's the fun in that?) but I'll get you started.
First, there are two ways to do the content comparison. The lazy/mostly right way, which is comparing the length of the files; and the accurate but more involved way, which is comparing a hash of the contents of each file.
For simplicity's sake, let's do the easy way and compare file size.
Basically, you want two objects that represent the source and target folders:
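The snippet that belonged here did not survive on this page, so what follows is only a minimal sketch of how the two collections could be built with Get-ChildItem. The temporary sandbox folders and the file names are made up so the example runs anywhere; substitute your real folder1 and folder2 paths.

```powershell
# Hypothetical sandbox standing in for folder1 and folder2.
$root  = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$path1 = (New-Item -ItemType Directory -Path (Join-Path $root 'folder1')).FullName
$path2 = (New-Item -ItemType Directory -Path (Join-Path $root 'folder2')).FullName
Set-Content -Path (Join-Path $path1 'a.xml') -Value '<a/>'
Set-Content -Path (Join-Path $path2 'a.xml') -Value '<a>updated</a>'
Set-Content -Path (Join-Path $path2 'b.xml') -Value '<b/>'

# One collection of file objects per folder (top level only, *.xml).
$Folder1 = Get-ChildItem -Path $path1 -Filter *.xml
$Folder2 = Get-ChildItem -Path $path2 -Filter *.xml
```

Each collection holds FileInfo objects, so properties such as Name, Length and LastWriteTime are available for the comparison.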
Then you can use Compare-Object to see which items are different:
Compare-Object $Folder1 $Folder2 -Property Name, Length
This will list everything that is different by comparing only the name and length of the file objects in each collection.
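You can pipe that to a Where-Object filter to pick stuff that is different on the left side ("<=" marks items found only in the first collection, $Folder1; "=>" marks items found only in the second, $Folder2):
Compare-Object $Folder1 $Folder2 -Property Name, Length | Where-Object {$_.SideIndicator -eq "<="}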
And then pipe that to a ForEach-Object to copy where you want:
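The final snippet is also missing from this page. As a rough sketch of the whole pipeline, assuming the goal from the question (copy what is new or changed in folder2 into folder3), the filter below uses "=>" so that it selects the $Folder2 side. The sandbox paths and file names are hypothetical.

```powershell
# Hypothetical sandbox: folder1 (old), folder2 (new/updated), folder3 (target).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$path1, $path2, $path3 = 'folder1', 'folder2', 'folder3' | ForEach-Object {
    (New-Item -ItemType Directory -Path (Join-Path $root $_)).FullName
}
Set-Content -Path (Join-Path $path1 'same.xml')    -Value '<x/>'
Set-Content -Path (Join-Path $path2 'same.xml')    -Value '<x/>'
Set-Content -Path (Join-Path $path1 'changed.xml') -Value '<a/>'
Set-Content -Path (Join-Path $path2 'changed.xml') -Value '<a>updated</a>'
Set-Content -Path (Join-Path $path2 'new.xml')     -Value '<b/>'

$Folder1 = Get-ChildItem -Path $path1 -Filter *.xml
$Folder2 = Get-ChildItem -Path $path2 -Filter *.xml

# '=>' marks (Name, Length) pairs that occur only in the second collection.
Compare-Object $Folder1 $Folder2 -Property Name, Length |
    Where-Object { $_.SideIndicator -eq '=>' } |
    ForEach-Object {
        Copy-Item -Path (Join-Path $path2 $_.Name) -Destination $path3 -Force
    }
```

Keep in mind the trade-off mentioned at the start: an edit that leaves the file size unchanged is invisible to a Name/Length comparison.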
Recursive Directory Diff Using MD5 Hashing (Compares Content)
Here is a pure PowerShell v3+ recursive file diff (no dependencies) that calculates an MD5 hash of each file's contents in both directories (left/right). It can optionally export CSVs along with a summary text file. By default it writes results to stdout. You can either drop the rdiff.ps1 file into your path or copy its contents into your script.
USAGE: rdiff path/to/left,path/to/right [-s path/to/summary/dir]
Here is the gist. I recommend using the version from the gist, as it may gain additional features over time. Feel free to send pull requests.
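The rdiff.ps1 script itself is not reproduced on this page. To illustrate the idea only, here is a much smaller sketch that is not the gist's code: it uses the built-in Get-FileHash cmdlet (which needs PowerShell 4 or later, unlike the v3+ script described above), and the helper name Get-RelativeHash and the sandbox files are invented for the example.

```powershell
# Hypothetical helper: map each file's path (relative to $Root) to its MD5 hash.
function Get-RelativeHash {
    param([Parameter(Mandatory)][string]$Root)
    $map = @{}
    $rootPath = (Get-Item -LiteralPath $Root).FullName.TrimEnd('\', '/')
    Get-ChildItem -LiteralPath $rootPath -Recurse -File | ForEach-Object {
        $relative = $_.FullName.Substring($rootPath.Length + 1)
        $map[$relative] = (Get-FileHash -LiteralPath $_.FullName -Algorithm MD5).Hash
    }
    $map
}

# Hypothetical sandbox; note that changed.xml keeps the same length on both sides.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$path1, $path2 = 'left', 'right' | ForEach-Object {
    (New-Item -ItemType Directory -Path (Join-Path $root $_)).FullName
}
Set-Content -Path (Join-Path $path1 'same.xml')    -Value '<x/>'
Set-Content -Path (Join-Path $path2 'same.xml')    -Value '<x/>'
Set-Content -Path (Join-Path $path1 'changed.xml') -Value '<a>1</a>'
Set-Content -Path (Join-Path $path2 'changed.xml') -Value '<a>2</a>'
Set-Content -Path (Join-Path $path2 'new.xml')     -Value '<b/>'

$left  = Get-RelativeHash -Root $path1
$right = Get-RelativeHash -Root $path2

# Files on the right that are absent on the left or whose hash differs.
$newOrChanged = $right.Keys |
    Where-Object { -not $left.ContainsKey($_) -or $left[$_] -ne $right[$_] } |
    Sort-Object
```

Unlike a size comparison, this catches the same-length edit to changed.xml, at the cost of reading every file in full.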
Further to @JNK's answer, you might want to ensure that you are always working with files rather than the less-intuitive output from Compare-Object. You just need to use the -PassThru switch...
This at least means you don't have to worry about which way the SideIndicator arrow points!
Also, bear in mind that you might want to compare on LastWriteTime as well.
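The code for this answer is missing here, so the following is a hedged sketch of what the -PassThru approach might look like, with LastWriteTime added to the compared properties. The sandbox layout, the file names and the fixed 2020-01-01 timestamp are assumptions made purely so the example is repeatable.

```powershell
# Hypothetical sandbox: folder1 (old), folder2 (new/updated), folder3 (target).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$path1, $path2, $path3 = 'folder1', 'folder2', 'folder3' | ForEach-Object {
    (New-Item -ItemType Directory -Path (Join-Path $root $_)).FullName
}
$old = [datetime]'2020-01-01'
foreach ($dir in $path1, $path2) {
    Set-Content -Path (Join-Path $dir 'same.xml')    -Value '<x/>'
    Set-Content -Path (Join-Path $dir 'changed.xml') -Value '<a>1</a>'
    # Give both copies an identical timestamp, as if one were copied from the other.
    Get-ChildItem -Path $dir | ForEach-Object { $_.LastWriteTime = $old }
}
# Same length as before, so only LastWriteTime reveals this edit.
Set-Content -Path (Join-Path $path2 'changed.xml') -Value '<a>2</a>'
Set-Content -Path (Join-Path $path2 'new.xml')     -Value '<b/>'

$Folder1 = Get-ChildItem -Path $path1 -File
$Folder2 = Get-ChildItem -Path $path2 -File

# -PassThru emits the original FileInfo objects rather than summary objects.
$allDiffs = Compare-Object $Folder1 $Folder2 -Property Name, Length, LastWriteTime -PassThru

# Keep only the files that live in folder2, then copy them to folder3.
$changes = $allDiffs | Where-Object { $_.DirectoryName -eq $path2 }
$changes | Copy-Item -Destination $path3 -Force
```

Filtering on the file's own directory replaces the SideIndicator test, which is the point made above.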
Sub-folders
Looping through the sub-folders recursively is a little more complicated as you probably will need to strip off the respective root folder paths from the FullName field before comparing lists.
You could do this by adding a new ScriptProperty to your Folder1 and Folder2 lists:
You should then be able to use RelativePath as a property when comparing the two objects and also use that to join on to "C:\Folder3" when copying to keep the folder structure in place.
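A possible sketch of the recursive variant follows. Note one deliberate swap: it attaches RelativePath as a NoteProperty computed once per file, not as the ScriptProperty mentioned above, because that keeps the root path out of the property's script block. The helper name Get-FileWithRelativePath and the sandbox layout are hypothetical.

```powershell
# Hypothetical helper: list files under $Root, each tagged with a RelativePath.
function Get-FileWithRelativePath {
    param([Parameter(Mandatory)][string]$Root)
    Get-ChildItem -LiteralPath $Root -Recurse -File | ForEach-Object {
        $relative = $_.FullName.Substring($Root.Length).TrimStart('\', '/')
        $_ | Add-Member -MemberType NoteProperty -Name RelativePath -Value $relative -PassThru
    }
}

# Hypothetical sandbox with nested folders.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$path1, $path2, $path3 = 'folder1', 'folder2', 'folder3' | ForEach-Object {
    (New-Item -ItemType Directory -Path (Join-Path $root $_)).FullName
}
# -Force on a file item also creates any missing parent folders.
New-Item -ItemType File -Force -Path (Join-Path $path1 'same.xml')       -Value '<x/>' | Out-Null
New-Item -ItemType File -Force -Path (Join-Path $path2 'same.xml')       -Value '<x/>' | Out-Null
New-Item -ItemType File -Force -Path (Join-Path $path1 'sub/a.xml')      -Value '<a/>' | Out-Null
New-Item -ItemType File -Force -Path (Join-Path $path2 'sub/a.xml')      -Value '<a>updated</a>' | Out-Null
New-Item -ItemType File -Force -Path (Join-Path $path2 'sub/deep/b.xml') -Value '<b/>' | Out-Null

$Folder1 = Get-FileWithRelativePath -Root $path1
$Folder2 = Get-FileWithRelativePath -Root $path2

Compare-Object $Folder1 $Folder2 -Property RelativePath, Length -PassThru |
    Where-Object { $_.SideIndicator -eq '=>' } |
    ForEach-Object {
        # Rebuild the same sub-folder structure under folder3 before copying.
        $target = Join-Path $path3 $_.RelativePath
        New-Item -ItemType Directory -Path (Split-Path $target -Parent) -Force | Out-Null
        Copy-Item -LiteralPath $_.FullName -Destination $target -Force
    }
```

Comparing on RelativePath instead of Name also keeps two files with the same name in different sub-folders from being mistaken for each other.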
Here's an approach which will find files which are missing or differ in content.
First, a quick-and-dirty one-liner (see caveat below).
Run the one-liner in one of the directories, with $right set to (or replaced with) the path to the other directory. Things missing from $right, or which differ in content, will be reported. No output means no differences found. CAVEAT: Things existing in $right but missing from the left will not be found/reported.
This doesn't bother calculating hashes; it just compares the file contents directly. Hashing makes sense when you want to reference something in another context (later date, on another machine, etc.), but when we're comparing things directly, it adds nothing but overhead. (It's also theoretically possible for two files to have the same hash, although that's basically impossible to happen by accident. Deliberate attack, on the other hand...)
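Neither the one-liner nor the longer script is preserved on this page. The sketch below is therefore not the author's code, just one way the same one-way, hash-free comparison could look. The function name Compare-LeftToRight and the sandbox files are invented, and it reads each file as text, which suits XML but not large binaries.

```powershell
# Hypothetical helper: report relative paths under $Left that are missing
# from $Right or whose content differs. Files only in $Right are NOT reported.
function Compare-LeftToRight {
    param([Parameter(Mandatory)][string]$Left, [Parameter(Mandatory)][string]$Right)
    # Resolve to absolute paths, since .NET file APIs ignore PowerShell's location.
    $Left  = (Get-Item -LiteralPath $Left).FullName
    $Right = (Get-Item -LiteralPath $Right).FullName
    Get-ChildItem -LiteralPath $Left -Recurse -File | ForEach-Object {
        $relative = $_.FullName.Substring($Left.Length).TrimStart('\', '/')
        $other = Join-Path $Right $relative
        if (-not (Test-Path -LiteralPath $other -PathType Leaf)) {
            $relative    # missing on the right
        }
        elseif ([System.IO.File]::ReadAllText($_.FullName) -cne [System.IO.File]::ReadAllText($other)) {
            $relative    # present on both sides, content differs
        }
    }
}

# Hypothetical sandbox.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$leftDir, $rightDir = 'left', 'right' | ForEach-Object {
    (New-Item -ItemType Directory -Path (Join-Path $root $_)).FullName
}
Set-Content -Path (Join-Path $leftDir  'same.xml')       -Value '<x/>'
Set-Content -Path (Join-Path $rightDir 'same.xml')       -Value '<x/>'
Set-Content -Path (Join-Path $leftDir  'a.xml')          -Value '<a>1</a>'
Set-Content -Path (Join-Path $rightDir 'a.xml')          -Value '<a>2</a>'
Set-Content -Path (Join-Path $leftDir  'only-left.xml')  -Value '<l/>'
Set-Content -Path (Join-Path $rightDir 'only-right.xml') -Value '<r/>'

$differences = Compare-LeftToRight -Left $leftDir -Right $rightDir | Sort-Object
```

Here only-right.xml goes unreported, which is exactly the caveat described above; run the function a second time with the arguments swapped to cover both directions.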
Here's a more proper script, which handles more corner cases and errors.
Do this:
And even recursively:
and is even hard to forget :)
Handy version using script parameter
Simple file-level comparison
Call it like
PS > .\DirDiff.ps1 -a .\Old\ -b .\New\
Possible output:
gci -path 'C:\Folder' -recurse |where{$_.PSIsContainer}
-recurse will explore all subtrees below the root path given and the .PSIsContainer property is the one you want to test for to grab all folders only. You can use where{!$_.PSIsContainer} for just files.
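For what it's worth, PowerShell 3.0 and later also give Get-ChildItem the -Directory and -File switches, which do the same filtering without a Where-Object step. The short sketch below shows both forms side by side; the sandbox folder and file names are made up for the example.

```powershell
# Hypothetical sandbox: two nested folders and two files.
$root = (New-Item -ItemType Directory -Path (
    Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString()))).FullName
New-Item -ItemType File -Force -Path (Join-Path $root 'top.xml')             -Value '<t/>' | Out-Null
New-Item -ItemType File -Force -Path (Join-Path $root 'sub/inner/nested.xml') -Value '<n/>' | Out-Null

# Folders only, via the PSIsContainer test from the answer above.
$foldersViaWhere = Get-ChildItem -Path $root -Recurse | Where-Object { $_.PSIsContainer }

# The same selections using the dedicated switches (PowerShell 3.0+).
$foldersViaSwitch = Get-ChildItem -Path $root -Recurse -Directory
$filesViaSwitch   = Get-ChildItem -Path $root -Recurse -File
```

Both folder queries return the same two directories (sub and sub/inner), while the -File query returns the two XML files.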