Is there a safe way to run a diff on two zip compressed files?

Posted on 2024-07-14 16:38:58

It seems this would not be a deterministic thing. Or is there a way to do this reliably?

青衫负雪 2024-07-21 16:38:58

If you're using gzip, you can do something like this:

# diff <(zcat file1.gz) <(zcat file2.gz)

酷遇一生 2024-07-21 16:38:58

Reliable: unzip both, diff.

I have no idea if that answer's good enough for your use, but it works.
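As a rough illustration of "unzip both, diff", here is a minimal Python sketch. It is only one possible way to do it: it assumes both archives fit on disk in the temporary directory, and the function names `diff_extracted` and `_collect` are made up for this example. Note that `filecmp.dircmp` skips a few names such as `CVS` and `.git` by default.

```python
import filecmp
import os
import tempfile
import zipfile


def _collect(cmp, prefix, report):
    """Walk a filecmp.dircmp tree and gather relative paths into report."""
    report["only_in_first"] += [os.path.join(prefix, n) for n in cmp.left_only]
    report["only_in_second"] += [os.path.join(prefix, n) for n in cmp.right_only]
    # Re-check common files byte by byte instead of trusting os.stat signatures.
    _, mismatch, errors = filecmp.cmpfiles(
        cmp.left, cmp.right, cmp.common_files, shallow=False)
    report["different"] += [os.path.join(prefix, n) for n in mismatch + errors]
    for name, sub in cmp.subdirs.items():
        _collect(sub, os.path.join(prefix, name), report)


def diff_extracted(zip1, zip2):
    """Extract both archives to temporary directories and compare the trees."""
    report = {"only_in_first": [], "only_in_second": [], "different": []}
    with tempfile.TemporaryDirectory() as dir1, \
            tempfile.TemporaryDirectory() as dir2:
        with zipfile.ZipFile(zip1) as archive:
            archive.extractall(dir1)
        with zipfile.ZipFile(zip2) as archive:
            archive.extractall(dir2)
        _collect(filecmp.dircmp(dir1, dir2), "", report)
    # The temporary directories are removed when the with block exits.
    return {key: sorted(value) for key, value in report.items()}
```

Calling `diff_extracted("a.zip", "b.zip")` returns three sorted lists of relative paths; it reports which files differ but does not print a line-by-line diff.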

哭了丶谁疼 2024-07-21 16:38:58

In general, you cannot avoid decompressing and then comparing. Different compressors will produce different DEFLATEd byte streams which, when INFLATEd, yield the same original text. You cannot simply compare the DEFLATEd data of one archive to another. That will FAIL in some cases.

But in a ZIP scenario, there is a CRC32 calculated and stored for each entry. So if you want to check files, you can simply compare the stored CRC32 associated with each DEFLATEd stream, with the caveat that a CRC32 is only a 32-bit checksum, so two different files can occasionally share the same value. It may fit your needs to compare the FileName and the CRC.

You would need a ZIP library that reads zip files and exposes those things as properties on the "ZipEntry" object. DotNetZip will do that for .NET apps.
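The same name-and-CRC comparison can be sketched with Python's standard `zipfile` module, which exposes the stored CRC-32 and uncompressed size on each `ZipInfo` object. This is only an illustration of the idea, not the DotNetZip approach itself; the function names `zip_index` and `compare_zip_metadata` are invented for the example.

```python
import zipfile


def zip_index(path):
    """Map each entry name to (CRC-32, uncompressed size) from the zip metadata."""
    with zipfile.ZipFile(path) as archive:
        return {
            info.filename: (info.CRC, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        }


def compare_zip_metadata(path1, path2):
    """Return (only_in_first, only_in_second, changed) as sorted lists of names."""
    first, second = zip_index(path1), zip_index(path2)
    only_in_first = sorted(first.keys() - second.keys())
    only_in_second = sorted(second.keys() - first.keys())
    # An entry counts as changed when its stored CRC-32 or size differs.
    changed = sorted(
        name for name in first.keys() & second.keys()
        if first[name] != second[name]
    )
    return only_in_first, only_in_second, changed
```

Nothing is decompressed here, so the check is fast, but it inherits the CRC32 caveat above: matching checksums make identical content very likely, not certain.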

甜嗑 2024-07-21 16:38:58

zipcmp compares the zip archives zip1 and zip2 and checks if they contain the same files, comparing their names, uncompressed sizes, and CRCs. File order and compressed size differences are ignored.

sudo apt-get install zipcmp

海之角 2024-07-21 16:38:58

Actually, gzip and bzip2 both come with dedicated tools for doing that.

With gzip:

$ zdiff file1.gz file2.gz

With bzip2:

$ bzdiff file1.bz2 file2.bz2

But keep in mind that for very large files you might run into memory issues (I originally came here to find out how to solve them, so I don't have the answer yet).
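If all that is needed for very large files is a yes/no answer rather than a full diff, one way to keep memory use bounded is to decompress both streams chunk by chunk and stop at the first mismatch. The following is a minimal sketch of that idea using Python's standard `gzip` module; the function name and the 1 MiB default chunk size are arbitrary choices for the example, and it does not replace zdiff's line-by-line output.

```python
import gzip


def gzip_contents_equal(path1, path2, chunk_size=1 << 20):
    """Compare the decompressed contents of two gzip files chunk by chunk."""
    with gzip.open(path1, "rb") as f1, gzip.open(path2, "rb") as f2:
        while True:
            block1 = f1.read(chunk_size)
            block2 = f2.read(chunk_size)
            if block1 != block2:
                return False  # first differing chunk, or one stream ended early
            if not block1:
                return True  # both streams ended at the same point
```

Only two chunks are held in memory at any time, regardless of how large the decompressed data is.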

一刻暧昧 2024-07-21 16:38:58

Beyond Compare has no problem with this.

辞旧 2024-07-21 16:38:58

This isn't particularly elegant, but you can use the FileMerge application that comes with the Mac OS X developer tools to compare the contents of zip files using a custom filter.

Create a script ~/bin/zip_filemerge_filter.bash with contents:

#!/bin/bash
##
#  List the size, CRC-32 checksum, and file path of each file in a zip archive,
#  sorted in order by file path.
##
unzip -v -l "${1}" | cut -c 1-9,59-,49-57 | sort -k3
exit $?

Make the script executable (chmod +x ~/bin/zip_filemerge_filter.bash).

Open FileMerge, open the Preferences, and go to the "Filters" tab. Add an item to the list with:
Extension: "zip", Filter: "~/bin/zip_filemerge_filter.bash $(FILE)", Display: Filtered, Apply*: No. (I've also added the filter for .jar and .war files.)

Then use FileMerge (or the command line "opendiff" wrapper) to compare two .zip files.

This won't let you diff the contents of files within the zip archives, but it will let you quickly see which files appear in only one archive and which files exist in both but have different content (i.e. different size and/or checksum).

北方。的韩爷 2024-07-21 16:38:58

I found relief with this simple Perl script: diffzips.pl

It recursively diffs every zip file inside the original zip, which is especially useful for the different Java package formats: jar, war, and ear.

zipcmp uses a simpler approach and doesn't recurse into archived zips.
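To make the idea of recursing into nested archives concrete, here is a small Python sketch. It is not how diffzips.pl works internally (that script is not shown here); it simply assumes that entries ending in .zip, .jar, .war or .ear are themselves valid zip archives, reads them into memory, and compares everything else by stored CRC-32. The name `diff_nested` and the `!/` separator in the labels are choices made for this example.

```python
import io
import zipfile

# Assumption: entries with these suffixes are themselves zip archives.
NESTED_SUFFIXES = (".zip", ".jar", ".war", ".ear")


def diff_nested(zip1, zip2, prefix=""):
    """List differences in name order, descending into nested archives."""
    differences = []
    with zipfile.ZipFile(zip1) as a, zipfile.ZipFile(zip2) as b:
        entries_a = {i.filename: i for i in a.infolist() if not i.is_dir()}
        entries_b = {i.filename: i for i in b.infolist() if not i.is_dir()}
        for name in sorted(entries_a.keys() | entries_b.keys()):
            label = prefix + name
            if name not in entries_b:
                differences.append("only in first: " + label)
            elif name not in entries_a:
                differences.append("only in second: " + label)
            elif name.lower().endswith(NESTED_SUFFIXES):
                # Read both nested archives into memory and compare their entries.
                inner_a = io.BytesIO(a.read(name))
                inner_b = io.BytesIO(b.read(name))
                differences += diff_nested(inner_a, inner_b, label + "!/")
            elif entries_a[name].CRC != entries_b[name].CRC:
                differences.append("differs: " + label)
    return differences
```

A nested entry that has one of these suffixes but is not a real zip file would raise `zipfile.BadZipFile`; a more careful version would catch that and fall back to the CRC comparison.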

还给你自由 2024-07-21 16:38:58

WinMerge (Windows only) has lots of features, and one of them is:

  • Archive file support using 7-Zip

不一样的天空 2024-07-21 16:38:58

A python solution for zip files:

import difflib
import zipfile


def read_lines(archive, name):
    # Decode leniently so that binary entries do not raise; ndiff needs text.
    data = archive.read(name)
    return data.decode("utf-8", errors="replace").splitlines(keepends=True)


def diff(filename1, filename2):
    differs = False

    # Passing the path lets zipfile open the archive in binary mode itself.
    with zipfile.ZipFile(filename1) as z1, zipfile.ZipFile(filename2) as z2:
        if len(z1.infolist()) != len(z2.infolist()):
            print("number of archive elements differ: {} in {} vs {} in {}".format(
                len(z1.infolist()), z1.filename, len(z2.infolist()), z2.filename))
            return 1
        names2 = set(z2.namelist())
        for zipentry in z1.infolist():
            if zipentry.filename not in names2:
                print("no file named {} found in {}".format(zipentry.filename,
                                                            z2.filename))
                differs = True
            else:
                lines = difflib.ndiff(read_lines(z1, zipentry.filename),
                                      read_lines(z2, zipentry.filename))
                # Keep only the removed and added lines, without the markers.
                delta = ''.join(x[2:] for x in lines
                                if x.startswith('- ') or x.startswith('+ '))
                if delta:
                    differs = True
                    print("content for {} differs:\n{}".format(
                        zipentry.filename, delta))
    if not differs:
        print("all files are the same")
        return 0
    return 1

Use as

diff(filename1, filename2)

It compares files line-by-line in memory and shows changes.

漆黑的白昼 2024-07-21 16:38:58

I gave up trying to use existing tools and wrote a little bash script that works for me:

#!/bin/bash
# Author: Onno Benschop, [email protected]
# Note: This requires enough space for both archives to be extracted in the tempdir

if [ $# -ne 2 ] ; then
  # Print the usage message to stderr and signal failure
  echo "Usage: $(basename "$0") zip1 zip2" >&2
  exit 1
fi

# Make temporary directories
archive_1=$(mktemp -d)
archive_2=$(mktemp -d)

# Unzip the archives
unzip -qq "$1" -d "${archive_1}"
unzip -qq "$2" -d "${archive_2}"

# Compare them
diff -r "${archive_1}" "${archive_2}"

# Remove the temporary directories
rm -rf "${archive_1}" "${archive_2}"

木落 2024-07-21 16:38:58

这里的许多解决方案要么只是检查 CRC 以查看是否存在差异,要么是复杂的脚本,需要解压缩到磁盘,使用外部程序,要么需要除您所要求的压缩格式之外的特定压缩格式关于(zcat 不适用于 zip)。

这是一个简单、易于阅读的内容,并且应该在任何有 bash 的地方都可以工作,显示文件内容之间的差异如果像我一样,这就是您遇到这个问题时所需要的

diff \
    <(zipinfo -1 "$zip1" '*' \
    | grep '[^/]

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:

diff \
    <(zipinfo -1 "$zip" '*' \
    | grep '[^/]

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"

diff \
    <(zipinfo -1 "$zip" '*' \
    | grep '[^/]

Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

(zipinfo -1 "$zip1"; zipinfo -1 "$zip2") \
    | grep '[^/]

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \ ) \ <(zipinfo -1 "$zip2" '*' \ | grep '[^/]

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:


或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"



Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。


不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \ )

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:


或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"



Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。


不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip" "$file"; done \ ) \ <(find "$directory" -type f -name '*' \ | sort \ | while IFS= read -r file do printf 'Archive: %s\n inflating: %s\n' "$directory" `echo $file | sed "s|$directory/||"` cat "$file" echo done \ )

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"



Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。


不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \ ) \ <(zipinfo -1 "$zip2" '*' \ | grep '[^/]

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:


或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"



Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。


不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \ )

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:


或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"



Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。


不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip" "$file"; done \ ) \ <(zipinfo -1 "$zip" '*' \ | grep '[^/]

Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。


不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \ ) \ <(zipinfo -1 "$zip2" '*' \ | grep '[^/]

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:


或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"



Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。


不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \ )

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"


Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip" "$file"; done \ ) \ <(find "$directory" -type f -name '*' \ | sort \ | while IFS= read -r file do printf 'Archive: %s\n inflating: %s\n' "$directory" `echo $file | sed "s|$directory/||"` cat "$file" echo done \ )

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"


Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \ ) \ <(zipinfo -1 "$zip2" '*' \ | grep '[^/]

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"


Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \ )

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"


Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file do printf 'Archive: %s\n inflating: %s\n' "$directory" "$file" cat "$directory/$file" echo done \ )

Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \ ) \ <(zipinfo -1 "$zip2" '*' \ | grep '[^/]

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"


Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \ )

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"


Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip" "$file"; done \ ) \ <(find "$directory" -type f -name '*' \ | sort \ | while IFS= read -r file do printf 'Archive: %s\n inflating: %s\n' "$directory" `echo $file | sed "s|$directory/||"` cat "$file" echo done \ )

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"


Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \ ) \ <(zipinfo -1 "$zip2" '*' \ | grep '[^/]

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"


Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。

\ | sort \ | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \ )

这会解压缩-内存,而不是磁盘,在比较时从管道中释放数据(它不会解压缩并然后进行比较,因此不应使用太多内存)。
想要更改差异选项以忽略空格或并排使用? 将 diff 更改为 diff -wgvimdiff (这会将所有文件保留在内存中)等等。
假设您只想比较 .js 文件? 将 * 更改为 *.js
只想查看其中之一缺少的文件名? 删除 while 行,就不会打扰解压缩。

简单。

它甚至可以安全地处理(跳过并将其记录到stderr)带有“非法”字符(例如换行符和反斜杠)的文件名。
没有比这更“安全”的了。

slm 的答案非常适合返回不同的文件(不显示差异),甚至根本不解压缩,这很好。 如果出于某种原因您希望比 CRC 更进一步,则在此答案中您可以添加 | 之前的 sha512sum; 例如,完成 并得到“两全其美”:P


同样,比较存档和真实目录相对容易:

或者,仅忽略目录中的文件,基本上是一个方便的空运行 unzip -o -d "$directory"


Windows? 对不起。 虽然脚本很简单,并且很容易移植到[语法上]出色的 powershell,但它不起作用。 本机 cmdlet 仅限提取到磁盘,MS 仍然尚未修复 PS 中的二进制数据管道已损坏,因此您也无法以这种方式“安全”使用外部 zip.exe

显然其他人已经直接使用 .NET API 完成了类似的事情,但它不再是一个优雅的端口,而更多的是.NET 中的重新实现:|


<子>
关于之前提到的“非法文件名”的注释:
如果您希望它能够与这些一起工作,实际上并不太困难; 您只需将 $file$(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n /g;s/\^M/\r/g').

添加其他ctrl chars 当你遇到它们时。

原因是,出于某种原因,即使 zipinfo 显示带有 \n 其中为 ^J,它不会接受这些用于 unzip 的安全名称,只接受原始名称! 即使它可以使用 unzip -^< 提取到那些非法文件名/a> 根本无法通过 zipinfo 获取这些原始文件名。 因此,您需要从安全的、不可用的文件名中构建原始的、非法的文件名来引用它们以进行差异:(
如果执行此操作,请注意,无法区分字面上的 ^J 和显示为 ^J\n,并且 zip 不会根本不支持文件名中的 /^@


作为奖励; 您可以将所有这些差异直接写入存档,并将它们全部保存在与原始文件匹配的文件夹层次结构中,而不是尝试在一个大文件中一次读取所有内容。

不是一个漂亮的脚本,但现在您可以在选择的 gui 存档器中打开它,或者执行 unzip -p diff.zip some/dir/some.file 来具体查看与该文件的差异,或者如果没有差异,就会收到“未找到”的欢迎信息,这在实践中要漂亮得多。


Many of the solutions here only check the CRC to see whether differences exist, are complicated scripts, require decompressing to disk, use external programs, or need a compression format other than the one you asked about (zcat does NOT work with zip).

Here's one that's simple, easy to read, and should work wherever you have bash. It shows the differences between the file contents, if, like me, that's what you needed when you came across this question:

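diff \
    <(zipinfo -1 "$zip1" '*' \
    | grep '[^/]$' \
    | sort \
    | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \
    ) \
    <(zipinfo -1 "$zip2" '*' \
    | grep '[^/]$' \
    | sort \
    | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \
    )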

This decompresses in memory, not to disk, releasing data from the pipe as it diffs (it won't decompress everything and then compare, so it shouldn't use much memory).
Want to change the diff options to ignore whitespace or show side-by-side output? Change diff to diff -w or gvimdiff (this one will keep all files in memory), et cetera.
Say you only want to diff the .js files? Change '*' to '*.js'.
Only want to see the filenames that are missing from one or the other? Remove the while line and it won't bother decompressing.

Easy.

It will even safely handle filenames with "illegal" characters such as newlines and backslashes (it skips them and reports them to stderr).
Doesn't get "safe"r than this.

slm's answer is pretty good for returning the files that differ (without showing the differences), and it doesn't even decompress at all, which is nice. If for some reason you want that but a step above CRC, in this answer you could add | sha512sum before the ; done, for example, and get 'the worst of both worlds' :P
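As a rough sketch of that sha512sum idea, assuming Info-ZIP's unzip and zipinfo plus a sha512sum binary are installed: the helper below (zip_member_hashes is a made-up name, not part of any tool) prints one hash per archive member. It uses unzip -p rather than unzip -c, which is my own choice here, so that the header lines unzip -c prints do not end up in the hash.

```shell
# Hypothetical helper (the name is illustrative): print "<sha512>  <member>"
# for every non-directory entry of the archive given as $1.
zip_member_hashes() {
    archive=$1
    zipinfo -1 "$archive" '*' \
        | grep '[^/]$' \
        | sort \
        | while IFS= read -r file; do
            # -p writes only the member's bytes; </dev/null keeps unzip
            # from reading the name list that feeds this loop.
            hash=$(unzip -p "$archive" "$file" </dev/null | sha512sum | cut -d' ' -f1)
            printf '%s  %s\n' "$hash" "$file"
        done
}
```

In bash, two archives could then be compared with diff <(zip_member_hashes "$zip1") <(zip_member_hashes "$zip2"), which lists only the members whose contents differ or that exist on one side only.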


Similarly it's relatively easy to compare an archive and a real directory:

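diff \
    <(zipinfo -1 "$zip" '*' \
    | grep '[^/]$' \
    | sort \
    | while IFS= read -r file; do unzip -c "$zip" "$file"; done \
    ) \
    <(find "$directory" -type f -name '*' \
    | sort \
    | while IFS= read -r file
        do
            # strip the leading "$directory/" so names match the archive's
            printf 'Archive: %s\n inflating: %s\n' "$directory" "${file#"$directory"/}"
            cat "$file"
            echo
        done \
    )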

Or, ignoring files only in the directory, basically a handy dry-run of unzip -o -d "$directory":

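diff \
    <(zipinfo -1 "$zip" '*' \
    | grep '[^/]$' \
    | sort \
    | while IFS= read -r file; do unzip -c "$zip" "$file"; done \
    ) \
    <(zipinfo -1 "$zip" '*' \
    | grep '[^/]$' \
    | sort \
    | while IFS= read -r file
        do
            printf 'Archive: %s\n inflating: %s\n' "$directory" "$file"
            cat "$directory/$file"
            echo
        done \
    )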

Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic PowerShell, it wouldn't work. The native cmdlet only extracts to disk, and MS still haven't fixed the broken binary data piping in PS, so you can't "safely" use an external zip.exe in this manner either.

Apparently others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other ctrl chars as you happen across them.

The reason is that, even though zipinfo displays a filename with \n in it as ^J, unzip will not accept these safe names, only the original! And even though unzip CAN extract to those illegal filenames with unzip -^, there's no way to get the original filenames out of zipinfo at all. So you need to build the original, illegal filename from the safe, unusable one to reference them for the diff :(
If you do this, note that there is no way to distinguish between a literal ^J and a \n displayed as ^J, and that zip doesn't support / or ^@ within filenames at all.
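To make the substitution from this note concrete, here is a small sketch that wraps the same sed expression in a function. The name unescape_zip_name is hypothetical, and the sketch assumes a sed (such as GNU sed) that expands \n and \r in the replacement text.

```shell
# Hypothetical helper: rebuild the raw member name from the caret-escaped
# form that zipinfo prints. Backslashes are doubled, ^J becomes a newline
# and ^M becomes a carriage return (same sed expression as in the note).
unescape_zip_name() {
    printf '%s\n' "$1" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g'
}
```

Inside the loops above you would then pass "$(unescape_zip_name "$file")" to unzip in place of "$file". Keep in mind that command substitution strips trailing newlines, so a name that ends in a newline still cannot be rebuilt this way.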


As a bonus, you can write all these diffs straight to an archive and keep them in a folder hierarchy matching the original files, instead of trying to read everything at once in one big splat.

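(zipinfo -1 "$zip1"; zipinfo -1 "$zip2") \
    | grep '[^/]$' \
    | sort \
    | uniq \
    | while IFS= read -r file; do
        (diff <(unzip -p "$zip1" "$file") <(unzip -p "$zip2" "$file") | zip 'diff.zip' - \
        && zipinfo -s 'diff.zip' - | awk '{ print $4; }' | grep '[^0]' \
        && printf "@ -\n@=$file\n" | zipnote -w 'diff.zip' \
        || zip -d 'diff.zip' -
        ) >/dev/null
    done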

Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

\ | sort \ | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \ ) \ <(zipinfo -1 "$zip2" '*' \ | grep '[^/]

This decompresses in-memory, not to disk, releasing data from the pipe as it diffs (it wont decompress and then compare, so shouldn't use much memory).
Want to change diffing options for ignoring whitespace or using side-by-side? Change diff to diff -w or gvimdiff (this one will keep all files in memory) et cetera.
Say you only want to diff the .js files? Change * to *.js.
Only want to see the filenames that are missing from one or the other? Remove the while line and it wont bother decompressing.

Easy.

It will even safely handle (skip and record it to stderr) filenames with "illegal" characters like newlines and backslashes.
Doesn't get "safe"r than this.

slm's answer is pretty good for returning files that are different (without showing differences) and doesn't even decompress at all which is nice. If for some reason you want that but a step above CRC, in this answer you could add | sha512sum before the ; done for example and get 'the worst of both worlds' :P


Similarly it's relatively easy to compare an archive and a real directory:


Or, ignoring files only in the directory, basically a handy dry-run of unzip -o -d "$directory":



Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic powershell, it wouldn't work. The native cmdlet only extracts to disk and MS still haven't fixed the broken binary data piping in PS so you cant "safely" use an external zip.exe in this manner either.

Apparenlty others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other ctrl chars as you happen across them.

The reason is, for some reason, even though zipinfo displays a filename with \n in it as ^J, it will not accept these safe names for unzip, only the original! And even though it CAN extract to those illegal filenames with unzip -^ there's no way to get these original filenames through zipinfo at all. So you need to build the original, illegal filename from the safe, unusable one to reference them for the diff :(
If you do this, note that there is no way to distinguish between ^J literally and \n displaying as ^J, and that zip doesn't support / or ^@ within filenames at all.


As a bonus; you can write all these diffs straight to an archive and keep them all in a folder heirarchy matching the original files instead of trying to read it all at once in one big splat.


Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

\ | sort \ | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \ )

This decompresses in-memory, not to disk, releasing data from the pipe as it diffs (it wont decompress and then compare, so shouldn't use much memory).
Want to change diffing options for ignoring whitespace or using side-by-side? Change diff to diff -w or gvimdiff (this one will keep all files in memory) et cetera.
Say you only want to diff the .js files? Change * to *.js.
Only want to see the filenames that are missing from one or the other? Remove the while line and it wont bother decompressing.

Easy.

It will even safely handle (skip and record it to stderr) filenames with "illegal" characters like newlines and backslashes.
Doesn't get "safe"r than this.

slm's answer is pretty good for returning files that are different (without showing differences) and doesn't even decompress at all which is nice. If for some reason you want that but a step above CRC, in this answer you could add | sha512sum before the ; done for example and get 'the worst of both worlds' :P


Similarly it's relatively easy to compare an archive and a real directory:


Or, ignoring files only in the directory, basically a handy dry-run of unzip -o -d "$directory":



Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic powershell, it wouldn't work. The native cmdlet only extracts to disk and MS still haven't fixed the broken binary data piping in PS so you cant "safely" use an external zip.exe in this manner either.

Apparenlty others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other ctrl chars as you happen across them.

The reason is, for some reason, even though zipinfo displays a filename with \n in it as ^J, it will not accept these safe names for unzip, only the original! And even though it CAN extract to those illegal filenames with unzip -^ there's no way to get these original filenames through zipinfo at all. So you need to build the original, illegal filename from the safe, unusable one to reference them for the diff :(
If you do this, note that there is no way to distinguish between ^J literally and \n displaying as ^J, and that zip doesn't support / or ^@ within filenames at all.


As a bonus; you can write all these diffs straight to an archive and keep them all in a folder heirarchy matching the original files instead of trying to read it all at once in one big splat.


Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

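diff \
  <(zipinfo -1 "$zip1" '*' \
    | grep '[^/]$' \
    | sort \
    | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \
  ) \
  <(zipinfo -1 "$zip2" '*' \
    | grep '[^/]$' \
    | sort \
    | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \
  )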

This decompresses in memory rather than to disk, and releases data from the pipe as it diffs (it won't decompress everything and then compare, so it shouldn't use much memory).
Want to change the diffing options, to ignore whitespace or show the files side by side? Change diff to diff -w or gvimdiff (this one will keep all files in memory), et cetera.
Say you only want to diff the .js files? Change * to *.js.
Only want to see the filenames that are missing from one or the other? Remove the while line and it won't bother decompressing.
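As a rough sketch of that last variant, the name listing can be pulled out into a helper and diffed on its own. The function name zip_names is made up for this example; the zipinfo, grep and sort steps are the ones from the script above, and the diff line needs a shell with process substitution, such as bash.

```shell
#!/usr/bin/env bash
# zip_names (hypothetical helper): print the sorted names of all file members
# of the archive given as $1. Directory entries (names ending in /) are dropped.
zip_names() {
    zipinfo -1 "$1" '*' | grep '[^/]$' | sort
}

# Names-only comparison, nothing is decompressed:
# lines starting with "<" exist only in $zip1, lines starting with ">" only in $zip2.
#   diff <(zip_names "$zip1") <(zip_names "$zip2")
```

Because no member is extracted, this only tells you which names are missing; files that exist in both archives with different contents will not show up.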

Easy.

It will even safely handle (skip and record it to stderr) filenames with "illegal" characters like newlines and backslashes.
Doesn't get "safe"r than this.

slm's answer is pretty good for returning the files that are different (without showing the differences), and it doesn't decompress at all, which is nice. If for some reason you want that but a step above CRC, you could add | sha512sum before the ; done in this answer, for example, and get 'the worst of both worlds' :P
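A minimal sketch of the hashing idea, under a couple of assumptions: the helper name zip_hashes is invented here, and it uses unzip -p instead of unzip -c, because the -c output starts with an "Archive:" header that contains the archive's own name, so hashes of that output would differ between two differently named archives even when the contents match.

```shell
#!/usr/bin/env bash
# zip_hashes (hypothetical helper): for the archive given as $1, print one line
# per file member in the form "<sha512>  <member name>", sorted by member name.
zip_hashes() {
    zipinfo -1 "$1" '*' \
      | grep '[^/]$' \
      | sort \
      | while IFS= read -r file; do
            # unzip -p writes only the member's bytes, with no header lines.
            # stdin is redirected so unzip can never eat the list of names.
            printf '%s  %s\n' \
                "$(unzip -p "$1" "$file" </dev/null | sha512sum | cut -d' ' -f1)" \
                "$file"
        done
}

# Show only which members differ, not how they differ:
#   diff <(zip_hashes "$zip1") <(zip_hashes "$zip2")
```

This still decompresses every member, so it costs about as much as the full diff; it only shortens the output.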


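Similarly, it's relatively easy to compare an archive and a real directory:

diff \
  <(zipinfo -1 "$zip" '*' \
    | grep '[^/]$' \
    | sort \
    | while IFS= read -r file; do unzip -c "$zip" "$file"; done \
  ) \
  <(find "$directory" -type f -name '*' \
    | sort \
    | while IFS= read -r file
      do
        printf 'Archive: %s\n inflating: %s\n' "$directory" "${file#"$directory"/}"
        cat "$file"
        echo
      done \
  )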

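Or, ignoring files that exist only in the directory, which is basically a handy dry run of unzip -o -d "$directory":

diff \
  <(zipinfo -1 "$zip" '*' \
    | grep '[^/]$' \
    | sort \
    | while IFS= read -r file; do unzip -c "$zip" "$file"; done \
  ) \
  <(zipinfo -1 "$zip" '*' \
    | grep '[^/]$' \
    | sort \
    | while IFS= read -r file
      do
        printf 'Archive: %s\n inflating: %s\n' "$directory" "$file"
        cat "$directory/$file"
        echo
      done \
  )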


Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic PowerShell, it wouldn't work. The native cmdlet only extracts to disk, and MS still haven't fixed the broken binary data piping in PS, so you can't "safely" use an external zip.exe in this manner either.

Apparently others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these, it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other control characters as you happen across them.

The reason is that, even though zipinfo displays a filename containing \n as ^J, unzip will not accept these safe names, only the original! And even though unzip -^ CAN extract to those illegal filenames, there's no way to get the original filenames out of zipinfo at all. So you need to rebuild the original, illegal filename from the safe, unusable one to reference it for the diff :(
If you do this, note that there is no way to distinguish between a literal ^J and a \n displayed as ^J, and that zip doesn't support / or ^@ within filenames at all.
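The substitution can be wrapped in a small helper so it is easier to extend. This is only a sketch: the name caret_to_raw is made up here, it covers just the ^J and ^M cases from the sed expression above, and like that expression it relies on GNU sed understanding \n and \r in the replacement text.

```shell
#!/usr/bin/env bash
# caret_to_raw (hypothetical helper): rebuild the raw member name from the
# caret-notation name that zipinfo prints, using the sed expression quoted above.
caret_to_raw() {
    # printf is used instead of echo so that backslashes in the name are
    # passed to sed untouched.
    # 1. double every backslash, 2. ^J -> newline, 3. ^M -> carriage return.
    # Assumes GNU sed (\n and \r in the replacement are extensions).
    printf '%s\n' "$1" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g'
}

# Inside the while loop this would replace the plain "$file":
#   unzip -c "$zip" "$(caret_to_raw "$file")"
```

One limitation of this form: command substitution strips trailing newlines, so a member whose name ends in a newline would still not be matched.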


As a bonus, you can write all these diffs straight to an archive and keep them in a folder hierarchy matching the original files, instead of trying to read it all at once in one big splat.


Not as pretty a script, but now you can open it up in your GUI archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences for that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.




A note about the "illegal filenames" mentioned before:
If you want it to work with these it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other ctrl chars as you happen across them.

The reason is, for some reason, even though zipinfo displays a filename with \n in it as ^J, it will not accept these safe names for unzip, only the original! And even though it CAN extract to those illegal filenames with unzip -^ there's no way to get these original filenames through zipinfo at all. So you need to build the original, illegal filename from the safe, unusable one to reference them for the diff :(
If you do this, note that there is no way to distinguish between ^J literally and \n displaying as ^J, and that zip doesn't support / or ^@ within filenames at all.


As a bonus; you can write all these diffs straight to an archive and keep them all in a folder heirarchy matching the original files instead of trying to read it all at once in one big splat.

Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

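diff \
    <(zipinfo -1 "$zip1" '*' \
        | grep '[^/]$' \
        | sort \
        | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \
    ) \
    <(zipinfo -1 "$zip2" '*' \
        | grep '[^/]$' \
        | sort \
        | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \
    )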

This decompresses in memory, not to disk, and releases data from the pipe as it diffs (it won't decompress everything and then compare, so it shouldn't use much memory).
Want to change the diffing options, say to ignore whitespace or to view side by side? Change diff to diff -w or gvimdiff (the latter will keep all files in memory), et cetera.
Only want to diff the .js files? Change * to *.js.
Only want to see the filenames that are missing from one archive or the other? Remove the while line and it won't bother decompressing.

Easy.

It will even safely handle filenames with "illegal" characters like newlines and backslashes: it skips them and records them on stderr.
Doesn't get "safe"r than this.

slm's answer is pretty good for listing which files differ (without showing the differences), and it doesn't decompress at all, which is nice. If for some reason you want that but a step above CRC, you could add | sha512sum before the ; done in this answer, for example, and get 'the worst of both worlds' :P
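
The names-only check mentioned above (dropping the while line so nothing is decompressed) can be sketched roughly as follows. This is only an illustration, not part of the original script: the function name names_only_in_one and the idea of saving each listing to a file first are assumptions, and it relies on the standard comm and mktemp utilities.

```shell
# Hypothetical helper, not part of the original answer.
# Usage: names_only_in_one LIST1 LIST2
# Each LIST is a text file with one archive entry name per line,
# for example the saved output of: zipinfo -1 "$zip1" '*'
names_only_in_one() {
    tmp1=$(mktemp) || return 1
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 1; }
    # Drop directory entries (names ending in "/"), then sort for comm.
    grep '[^/]$' "$1" | LC_ALL=C sort > "$tmp1"
    grep '[^/]$' "$2" | LC_ALL=C sort > "$tmp2"
    # comm -3 hides names common to both lists; names found only in
    # LIST2 are printed with a leading tab.
    LC_ALL=C comm -3 "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}
```

Saving zipinfo -1 "$zip1" '*' and zipinfo -1 "$zip2" '*' to two files and passing them to the function should print only the names missing from one side, without touching the compressed data.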


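Similarly, it's relatively easy to compare an archive and a real directory:

diff \
    <(zipinfo -1 "$zip" '*' \
        | grep '[^/]$' \
        | sort \
        | while IFS= read -r file; do unzip -c "$zip" "$file"; done \
    ) \
    <(find "$directory" -type f -name '*' \
        | sort \
        | while IFS= read -r file
            do
                printf 'Archive: %s\n inflating: %s\n' "$directory" "$(echo "$file" | sed "s|$directory/||")"
                cat "$file"
                echo
            done \
    )

Or, to ignore files that exist only in the directory, build the second listing from zipinfo -1 "$zip" '*' | grep '[^/]$' instead of find, so that only names present in the archive are read from "$directory". This is basically a handy dry run of unzip -o -d "$directory".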


Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic PowerShell, it wouldn't work. The native cmdlet only extracts to disk, and MS still hasn't fixed the broken binary data piping in PS, so you can't "safely" use an external zip.exe in this manner either.

Apparently others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these, it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other control characters as you happen across them.

The reason is that, even though zipinfo displays a filename containing \n as ^J, unzip will not accept these safe names, only the original! And even though unzip CAN extract to those illegal filenames with unzip -^, there's no way to get the original filenames out of zipinfo at all. So you need to rebuild the original, illegal filename from the safe, unusable one to reference it for the diff :(
If you do this, note that there is no way to distinguish a literal ^J from a \n displayed as ^J, and that zip doesn't support / or ^@ within filenames at all.
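
As a rough illustration of the substitution quoted above, the sed expression can be wrapped in a small function. The name zip_display_to_raw_name is made up for this sketch, and it assumes GNU sed, which expands \n and \r in the replacement text; other sed implementations may not.

```shell
# Hypothetical helper, not part of the original answer: wraps the sed
# expression quoted in the note. Assumes GNU sed (\n and \r are expanded
# in the replacement text).
zip_display_to_raw_name() {
    # Rule 1 doubles every backslash.
    # Rules 2 and 3 turn the ^J and ^M display forms back into a
    # newline and a carriage return.
    # printf is used instead of echo so backslashes in the name are
    # passed to sed untouched.
    printf '%s\n' "$1" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g'
}
```

Inside the loop it would be used as unzip -c "$zip" "$(zip_display_to_raw_name "$file")". Note that command substitution strips trailing newlines, so a name that ends in a newline would still not round-trip.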


As a bonus, you can write all these diffs straight to an archive and keep them in a folder hierarchy matching the original files, instead of trying to read it all at once in one big splat.

    | sort \
    | uniq \
    | while IFS= read -r file; do
        (
            diff <(unzip -p "$zip1" "$file") <(unzip -p "$zip2" "$file") \
                | zip 'diff.zip' - \
            && zipinfo -s 'diff.zip' - | awk '{ print $4; }' | grep '[^0]' \
            && printf "@ -\n@=$file\n" | zipnote -w 'diff.zip' \
            || zip -d 'diff.zip' -
        ) >/dev/null
    done

It's not as pretty a script, but now you can open it up in your GUI archiver of choice, or run unzip -p diff.zip some/dir/some.file to see the differences for that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.


Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

\ | sort \ | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \ )

This decompresses in-memory, not to disk, releasing data from the pipe as it diffs (it wont decompress and then compare, so shouldn't use much memory).
Want to change diffing options for ignoring whitespace or using side-by-side? Change diff to diff -w or gvimdiff (this one will keep all files in memory) et cetera.
Say you only want to diff the .js files? Change * to *.js.
Only want to see the filenames that are missing from one or the other? Remove the while line and it wont bother decompressing.

Easy.

It will even safely handle (skip and record it to stderr) filenames with "illegal" characters like newlines and backslashes.
Doesn't get "safe"r than this.

slm's answer is pretty good for returning files that are different (without showing differences) and doesn't even decompress at all which is nice. If for some reason you want that but a step above CRC, in this answer you could add | sha512sum before the ; done for example and get 'the worst of both worlds' :P


Similarly it's relatively easy to compare an archive and a real directory:

Or, ignoring files only in the directory, basically a handy dry-run of unzip -o -d "$directory":


Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic powershell, it wouldn't work. The native cmdlet only extracts to disk and MS still haven't fixed the broken binary data piping in PS so you cant "safely" use an external zip.exe in this manner either.

Apparenlty others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other ctrl chars as you happen across them.

The reason is, for some reason, even though zipinfo displays a filename with \n in it as ^J, it will not accept these safe names for unzip, only the original! And even though it CAN extract to those illegal filenames with unzip -^ there's no way to get these original filenames through zipinfo at all. So you need to build the original, illegal filename from the safe, unusable one to reference them for the diff :(
If you do this, note that there is no way to distinguish between ^J literally and \n displaying as ^J, and that zip doesn't support / or ^@ within filenames at all.


As a bonus; you can write all these diffs straight to an archive and keep them all in a folder heirarchy matching the original files instead of trying to read it all at once in one big splat.

Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

\ | sort \ | while IFS= read -r file do printf 'Archive: %s\n inflating: %s\n' "$directory" "$file" cat "$directory/$file" echo done \ )

Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic powershell, it wouldn't work. The native cmdlet only extracts to disk and MS still haven't fixed the broken binary data piping in PS so you cant "safely" use an external zip.exe in this manner either.

Apparenlty others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other ctrl chars as you happen across them.

The reason is, for some reason, even though zipinfo displays a filename with \n in it as ^J, it will not accept these safe names for unzip, only the original! And even though it CAN extract to those illegal filenames with unzip -^ there's no way to get these original filenames through zipinfo at all. So you need to build the original, illegal filename from the safe, unusable one to reference them for the diff :(
If you do this, note that there is no way to distinguish between ^J literally and \n displaying as ^J, and that zip doesn't support / or ^@ within filenames at all.


As a bonus; you can write all these diffs straight to an archive and keep them all in a folder heirarchy matching the original files instead of trying to read it all at once in one big splat.

Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

\ | sort \ | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \ ) \ <(zipinfo -1 "$zip2" '*' \ | grep '[^/]

This decompresses in-memory, not to disk, releasing data from the pipe as it diffs (it wont decompress and then compare, so shouldn't use much memory).
Want to change diffing options for ignoring whitespace or using side-by-side? Change diff to diff -w or gvimdiff (this one will keep all files in memory) et cetera.
Say you only want to diff the .js files? Change * to *.js.
Only want to see the filenames that are missing from one or the other? Remove the while line and it wont bother decompressing.

Easy.

It will even safely handle (skip and record it to stderr) filenames with "illegal" characters like newlines and backslashes.
Doesn't get "safe"r than this.

slm's answer is pretty good for returning files that are different (without showing differences) and doesn't even decompress at all which is nice. If for some reason you want that but a step above CRC, in this answer you could add | sha512sum before the ; done for example and get 'the worst of both worlds' :P


Similarly it's relatively easy to compare an archive and a real directory:

Or, ignoring files only in the directory, basically a handy dry-run of unzip -o -d "$directory":


Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic powershell, it wouldn't work. The native cmdlet only extracts to disk and MS still haven't fixed the broken binary data piping in PS so you cant "safely" use an external zip.exe in this manner either.

Apparenlty others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other ctrl chars as you happen across them.

The reason is, for some reason, even though zipinfo displays a filename with \n in it as ^J, it will not accept these safe names for unzip, only the original! And even though it CAN extract to those illegal filenames with unzip -^ there's no way to get these original filenames through zipinfo at all. So you need to build the original, illegal filename from the safe, unusable one to reference them for the diff :(
If you do this, note that there is no way to distinguish between ^J literally and \n displaying as ^J, and that zip doesn't support / or ^@ within filenames at all.


As a bonus; you can write all these diffs straight to an archive and keep them all in a folder heirarchy matching the original files instead of trying to read it all at once in one big splat.

Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

\ | sort \ | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \ )

This decompresses in-memory, not to disk, releasing data from the pipe as it diffs (it wont decompress and then compare, so shouldn't use much memory).
Want to change diffing options for ignoring whitespace or using side-by-side? Change diff to diff -w or gvimdiff (this one will keep all files in memory) et cetera.
Say you only want to diff the .js files? Change * to *.js.
Only want to see the filenames that are missing from one or the other? Remove the while line and it wont bother decompressing.

Easy.

It will even safely handle (skip and record it to stderr) filenames with "illegal" characters like newlines and backslashes.
Doesn't get "safe"r than this.

slm's answer is pretty good for returning files that are different (without showing differences) and doesn't even decompress at all which is nice. If for some reason you want that but a step above CRC, in this answer you could add | sha512sum before the ; done for example and get 'the worst of both worlds' :P


Similarly it's relatively easy to compare an archive and a real directory:

Or, ignoring files only in the directory, basically a handy dry-run of unzip -o -d "$directory":


Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic powershell, it wouldn't work. The native cmdlet only extracts to disk and MS still haven't fixed the broken binary data piping in PS so you cant "safely" use an external zip.exe in this manner either.

Apparenlty others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other ctrl chars as you happen across them.

The reason is, for some reason, even though zipinfo displays a filename with \n in it as ^J, it will not accept these safe names for unzip, only the original! And even though it CAN extract to those illegal filenames with unzip -^ there's no way to get these original filenames through zipinfo at all. So you need to build the original, illegal filename from the safe, unusable one to reference them for the diff :(
If you do this, note that there is no way to distinguish between ^J literally and \n displaying as ^J, and that zip doesn't support / or ^@ within filenames at all.


As a bonus; you can write all these diffs straight to an archive and keep them all in a folder heirarchy matching the original files instead of trying to read it all at once in one big splat.

Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

\ | sort \ | while IFS= read -r file; do unzip -c "$zip" "$file"; done \ ) \ <(find "$directory" -type f -name '*' \ | sort \ | while IFS= read -r file do printf 'Archive: %s\n inflating: %s\n' "$directory" `echo $file | sed "s|$directory/||"` cat "$file" echo done \ )

Or, ignoring files only in the directory, basically a handy dry-run of unzip -o -d "$directory":


Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic powershell, it wouldn't work. The native cmdlet only extracts to disk and MS still haven't fixed the broken binary data piping in PS so you cant "safely" use an external zip.exe in this manner either.

Apparenlty others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other ctrl chars as you happen across them.

The reason is, for some reason, even though zipinfo displays a filename with \n in it as ^J, it will not accept these safe names for unzip, only the original! And even though it CAN extract to those illegal filenames with unzip -^ there's no way to get these original filenames through zipinfo at all. So you need to build the original, illegal filename from the safe, unusable one to reference them for the diff :(
If you do this, note that there is no way to distinguish between ^J literally and \n displaying as ^J, and that zip doesn't support / or ^@ within filenames at all.


As a bonus; you can write all these diffs straight to an archive and keep them all in a folder heirarchy matching the original files instead of trying to read it all at once in one big splat.

Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

\ | sort \ | while IFS= read -r file; do unzip -c "$zip1" "$file"; done \ ) \ <(zipinfo -1 "$zip2" '*' \ | grep '[^/]

This decompresses in-memory, not to disk, releasing data from the pipe as it diffs (it wont decompress and then compare, so shouldn't use much memory).
Want to change diffing options for ignoring whitespace or using side-by-side? Change diff to diff -w or gvimdiff (this one will keep all files in memory) et cetera.
Say you only want to diff the .js files? Change * to *.js.
Only want to see the filenames that are missing from one or the other? Remove the while line and it wont bother decompressing.

Easy.

It will even safely handle (skip and record it to stderr) filenames with "illegal" characters like newlines and backslashes.
Doesn't get "safe"r than this.

slm's answer is pretty good for returning files that are different (without showing differences) and doesn't even decompress at all which is nice. If for some reason you want that but a step above CRC, in this answer you could add | sha512sum before the ; done for example and get 'the worst of both worlds' :P


Similarly it's relatively easy to compare an archive and a real directory:

Or, ignoring files only in the directory, basically a handy dry-run of unzip -o -d "$directory":


Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic powershell, it wouldn't work. The native cmdlet only extracts to disk and MS still haven't fixed the broken binary data piping in PS so you cant "safely" use an external zip.exe in this manner either.

Apparenlty others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other ctrl chars as you happen across them.

The reason is, for some reason, even though zipinfo displays a filename with \n in it as ^J, it will not accept these safe names for unzip, only the original! And even though it CAN extract to those illegal filenames with unzip -^ there's no way to get these original filenames through zipinfo at all. So you need to build the original, illegal filename from the safe, unusable one to reference them for the diff :(
If you do this, note that there is no way to distinguish between ^J literally and \n displaying as ^J, and that zip doesn't support / or ^@ within filenames at all.


As a bonus; you can write all these diffs straight to an archive and keep them all in a folder heirarchy matching the original files instead of trying to read it all at once in one big splat.

Not as pretty a script, but now you can open it up in your gui archiver of choice or do unzip -p diff.zip some/dir/some.file to see the differences with that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.

\ | sort \ | while IFS= read -r file; do unzip -c "$zip2" "$file"; done \ )

This decompresses in-memory, not to disk, releasing data from the pipe as it diffs (it wont decompress and then compare, so shouldn't use much memory).
Want to change diffing options for ignoring whitespace or using side-by-side? Change diff to diff -w or gvimdiff (this one will keep all files in memory) et cetera.
Say you only want to diff the .js files? Change * to *.js.
Only want to see the filenames that are missing from one or the other? Remove the while line and it wont bother decompressing.

Easy.

It will even safely handle (skip and record it to stderr) filenames with "illegal" characters like newlines and backslashes.
Doesn't get "safe"r than this.

slm's answer is pretty good for returning files that are different (without showing differences) and doesn't even decompress at all which is nice. If for some reason you want that but a step above CRC, in this answer you could add | sha512sum before the ; done for example and get 'the worst of both worlds' :P
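
A minimal sketch of that "step above CRC" idea, written as a standalone function rather than as an edit to the script above. The name hash_entries is made up for illustration. It uses unzip -p instead of unzip -c, on the assumption that you do not want the "Archive:" header, which names the zip file, to end up inside the hash.

```shell
# Hypothetical helper: print "<sha512>  <entry name>" for every
# non-directory entry of the archive given as $1, sorted by name.
# The body runs in a subshell, so its variables stay local.
hash_entries() (
    zip=$1
    zipinfo -1 "$zip" '*' \
        | grep '[^/]$' \
        | sort \
        | while IFS= read -r file; do
              # unzip -p writes the raw entry data to stdout, with no headers.
              # Note that unzip treats "$file" as a wildcard pattern.
              sum=$(unzip -p "$zip" "$file" | sha512sum | cut -d' ' -f1)
              printf '%s  %s\n' "$sum" "$file"
          done
)
```

In bash, diff <(hash_entries "$zip1") <(hash_entries "$zip2") then lists only the entries whose contents differ, at the cost of decompressing everything.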


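Similarly, it's relatively easy to compare an archive and a real directory:

diff \
    <(zipinfo -1 "$zip" '*' \
        | grep '[^/]$' \
        | sort \
        | while IFS= read -r file; do unzip -c "$zip" "$file"; done \
    ) \
    <(find "$directory" -type f -name '*' \
        | sort \
        | while IFS= read -r file
          do
              printf 'Archive: %s\n inflating: %s\n' "$directory" "${file#"$directory"/}"
              cat "$file"
              echo
          done \
    )

Or, ignoring files that exist only in the directory, basically a handy dry run of unzip -o -d "$directory":

diff \
    <(zipinfo -1 "$zip" '*' \
        | grep '[^/]$' \
        | sort \
        | while IFS= read -r file; do unzip -c "$zip" "$file"; done \
    ) \
    <(zipinfo -1 "$zip" '*' \
        | grep '[^/]$' \
        | sort \
        | while IFS= read -r file
          do
              printf 'Archive: %s\n inflating: %s\n' "$directory" "$file"
              cat "$directory/$file"
              echo
          done \
    )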


Windows? Sorry. Whilst the scripts are simple and would be a cinch to port to the [syntactically] fantastic PowerShell, it wouldn't work. The native cmdlet only extracts to disk, and MS still haven't fixed the broken binary data piping in PS, so you can't "safely" use an external zip.exe in this manner either.

Apparently others have done similar things using the .NET API directly, but it'd become less of an elegant port and more of a reimplementation in .NET :|



A note about the "illegal filenames" mentioned before:
If you want it to work with these, it actually isn't too difficult; you'll just need to swap $file with $(echo "$file" | sed 's/\\/\\\\/g;s/\^J/\n/g;s/\^M/\r/g').

Add other control characters as you come across them.

The reason is that, even though zipinfo displays a filename containing \n as ^J, unzip will not accept these safe names, only the original! And even though unzip CAN extract to those illegal filenames with unzip -^, there's no way to get the original filenames out of zipinfo at all. So you need to rebuild the original, illegal filename from the safe, unusable one in order to reference it for the diff :(
If you do this, note that there is no way to distinguish between a literal ^J and a \n displayed as ^J, and that zip doesn't support / or ^@ within filenames at all.
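
The same substitution can be wrapped in a small function. This is only a sketch: decode_zipinfo_name is a made-up name, it covers just the three replacements from the sed expression above, and it spells the newline and carriage return literally because \n and \r in a sed replacement are a GNU extension.

```shell
# Hypothetical helper: turn the "safe" name printed by zipinfo back into
# the pattern unzip expects. Backslashes are doubled first, then ^J and ^M
# become a real newline and a real carriage return.
decode_zipinfo_name() {
    cr=$(printf '\r')
    printf '%s\n' "$1" | sed -e 's/\\/\\\\/g' -e 's/\^J/\
/g' -e "s/\^M/$cr/g"
}
```

You would then call unzip -c "$zip1" "$(decode_zipinfo_name "$file")" inside the loop. As noted above, a literal ^J in a name is rewritten too, and command substitution drops trailing newlines, so a name that ends in a newline will not survive the round trip.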


As a bonus, you can write all these diffs straight to an archive and keep them in a folder hierarchy matching the original files, instead of trying to read it all at once in one big splat.

Not as pretty a script, but now you can open it up in your GUI archiver of choice, or run unzip -p diff.zip some/dir/some.file to see the differences for that file specifically, or be greeted with "not found" if there are no differences, which is much prettier in practice.
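
The script for this bonus is not reproduced here, so what follows is a rough sketch of one way to get that result, not the answer's own script. The name diff_to_archive and the temporary directory layout are assumptions. Unlike the scripts above, it writes the second archive's copy of each entry to a temporary file. It also trusts the entry names, so only use it on archives you trust.

```shell
# Hypothetical helper: diff_to_archive ZIP1 ZIP2 OUT
# Stores one diff per differing entry of ZIP1 in OUT, under the entry's
# own path. Entries that exist only in ZIP2 are ignored.
diff_to_archive() (
    zip1=$1 zip2=$2 out=$3
    tmp=$(mktemp -d) || exit 1
    mkdir "$tmp/tree"

    zipinfo -1 "$zip1" '*' \
        | grep '[^/]$' \
        | sort \
        | while IFS= read -r file; do
              mkdir -p "$tmp/tree/$(dirname "$file")"
              unzip -p "$zip2" "$file" > "$tmp/right" 2>/dev/null
              # diff exits 0 when both sides match; drop the empty result then
              if unzip -p "$zip1" "$file" | diff - "$tmp/right" > "$tmp/tree/$file"; then
                  rm -f "$tmp/tree/$file"
              fi
          done

    # -D leaves out directory entries; "-" sends the archive to stdout.
    # If nothing differs, zip reports "Nothing to do" and OUT stays empty.
    (cd "$tmp/tree" && zip -q -r -D - .) > "$out"
    rm -rf "$tmp"
)
```

After diff_to_archive "$zip1" "$zip2" diff.zip, unzip -p diff.zip some/dir/some.file prints the diff for that entry.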

∞梦里开花 2024-07-21 16:38:58

I generally use an approach like @mrabbit's, but run two unzip commands and diff the output as required. For example, say I need to compare two Java WAR files:

$ sdiff --width 160 \
   <(unzip -l -v my_num1.war | cut -c 1-9,59-,49-57 | sort -k3) \
   <(unzip -l -v my_num2.war | cut -c 1-9,59-,49-57 | sort -k3)

Resulting in output like so:

--------          -------                                                       --------          -------
Archive:                                                                        Archive:
-------- -------- ----                                                          -------- -------- ----
48619281          130 files                                                   | 51043693          130 files
    1116 060ccc56 index.jsp                                                         1116 060ccc56 index.jsp
       0 00000000 META-INF/                                                            0 00000000 META-INF/
     155 b50f41aa META-INF/MANIFEST.MF                                        |      155 701f1623 META-INF/MANIFEST.MF
 Length   CRC-32  Name                                                           Length   CRC-32  Name
    1179 b42096f1 version.jsp                                                       1179 b42096f1 version.jsp
       0 00000000 WEB-INF/                                                             0 00000000 WEB-INF/
       0 00000000 WEB-INF/classes/                                                     0 00000000 WEB-INF/classes/
       0 00000000 WEB-INF/classes/com/                                                 0 00000000 WEB-INF/classes/com/
...
...