How to DEFLATE with a command line tool to extract a git object?

Posted on 2024-09-08 08:44:03

I'm looking for a command line wrapper for the DEFLATE algorithm.

I have a file (git blob) that is compressed using DEFLATE, and I want to uncompress it. The gzip command does not seem to have an option to directly use the DEFLATE algorithm, rather than the gzip format.

Ideally I'm looking for a standard Unix/Linux tool that can do this.

edit: This is the output I get when trying to use gzip for my problem:

$ cat .git/objects/c0/fb67ab3fda7909000da003f4b2ce50a53f43e7 | gunzip

gzip: stdin: not in gzip format

Comments (22)

迷爱 2024-09-15 08:44:03

You can do this with the OpenSSL command line tool:

openssl zlib -d < $IN > $OUT

Unfortunately, at least on Ubuntu, the zlib subcommand is disabled in the default build configuration (--no-zlib --no-zlib-dynamic), so you would need to compile openssl from source to use it. But it is enabled by default on Arch, for example.

Edit: Seems like the zlib command is no longer supported on Arch either. This answer might not be useful anymore :(

等风也等你 2024-09-15 08:44:03

Something like the following will print the raw content, including the "$type $length\0" header:

perl -MCompress::Zlib -e 'undef $/; print uncompress(<>)' \
     < .git/objects/27/de0a1dd5a89a94990618632967a1c86a82d577
放血 2024-09-15 08:44:03

Pythonic one-liner (updated for python3's sharp distinction between text and binary data):

$ python -c "import zlib,sys;\
           sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))" < $IN
帅哥哥的热头脑 2024-09-15 08:44:03

UPDATE: Mark Adler noted that git blobs are not raw DEFLATE streams, but zlib streams. These can be unpacked by the pigz tool, which comes pre-packaged in several Linux distributions:

$ cat foo.txt 
file foo.txt!

$ git ls-files -s foo.txt
100644 7a79fc625cac65001fb127f468847ab93b5f8b19 0   foo.txt

$ pigz -d < .git/objects/7a/79fc625cac65001fb127f468847ab93b5f8b19 
blob 14file foo.txt!

Edit by kriegaex: Git Bash for Windows users will notice that pigz is unavailable by default. You can find precompiled 32/64-bit versions here. I tried the 64-bit version and it works nicely. You can e.g. copy pigz.exe directly to c:\Program Files\Git\usr\bin in order to put it on the path.

Edit by mjaggard: Homebrew and Macports both have pigz available so you can install with brew install pigz or sudo port install pigz (if you do not have it already, you can install Homebrew by following the instructions on their website)


My original answer, kept for historical reasons:

If I understand the hint in the Wikipedia article mentioned by Marc van Kempen, you can use puff.c from zlib directly.

This is a small example:

#include <assert.h>
#include <string.h>
#include "puff.h"

int main( void ) {
    unsigned char dest[ 5 ];
    unsigned long destlen = 4;
    /* raw DEFLATE stream that inflates to "asdf" */
    const unsigned char source[] = "\x4B\x2C\x4E\x49\x03\x00";
    unsigned long sourcelen = 6;
    assert( puff( dest, &destlen, source, &sourcelen ) == 0 );
    dest[ 4 ] = '\0';
    /* dest is unsigned char, so cast it for strcmp */
    assert( strcmp( (const char *)dest, "asdf" ) == 0 );
    return 0;
}
檐上三寸雪 2024-09-15 08:44:03

You can use zlib-flate, like this:

cat .git/objects/c0/fb67ab3fda7909000da003f4b2ce50a53f43e7 \
    | zlib-flate -uncompress; echo

It's there by default on my machine, but if you need to install it, it's part of qpdf (tools for transforming and inspecting PDF files).

I've popped an echo on the end of the command, as it's easier to read the output that way.

送君千里 2024-09-15 08:44:03

Try the following command:

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - .git/objects/c0/fb67ab3fda7909000da003f4b2ce50a53f43e7 | gunzip

No external tools are needed.

Source: How to uncompress zlib data in UNIX? at unix SE

迷途知返 2024-09-15 08:44:03

Here is a Ruby one-liner (cd into .git/objects/ first and pick the path of any object):

ruby -rzlib -e 'print Zlib::Inflate.new.inflate(STDIN.read)' < ./74/c757240ec596063af8cd273ebd9f67073e1208
美羊羊 2024-09-15 08:44:03

I got tired of not having a good solution for this, so I put something on NPM:

https://github.com/jezell/zlibber

Now you can just pipe to the inflate / deflate commands.

记忆で 2024-09-15 08:44:03

Here's an example of breaking open a commit object in Python:

$ git show
commit 0972d7651ff85bedf464fba868c2ef434543916a
# all the junk in my commit...
$ python3
>>> import zlib
>>> file = open(".git/objects/09/72d7651ff85bedf464fba868c2ef434543916a", "rb")
>>> data = file.read()
>>> print(data)
# binary garbage
>>> unzipped_data = zlib.decompress(data)
>>> print(unzipped_data.decode())
# all the junk in my commit!

What you will see there is almost identical to the output of 'git cat-file -p [hash]', except that command doesn't print the header ('commit' followed by the size of the content and a null byte).
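
The header described above can be split off programmatically. The following is only a minimal sketch: the function name parse_git_object is made up for illustration, and the sample object is built in memory rather than read from .git/objects.

```python
import zlib

def parse_git_object(raw: bytes):
    """Decompress a loose git object and split it into (type, size, body)."""
    data = zlib.decompress(raw)
    # The header is "<type> <size>" terminated by a single null byte
    header, _, body = data.partition(b"\x00")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type.decode("ascii"), int(size), body

# In-memory stand-in for a loose blob object (hypothetical sample data)
content = b"file foo.txt!\n"
raw = zlib.compress(b"blob %d\x00" % len(content) + content)

obj_type, size, body = parse_git_object(raw)
print(obj_type, size, body)
```

To try it on a real object, read the file in binary mode, e.g. open(path, "rb").read(), and pass the bytes to the function.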

嗼ふ静 2024-09-15 08:44:03

Git objects are compressed with zlib rather than gzip, so either use zlib to uncompress them or use a git command, i.e. git cat-file -p <SHA1>, to print the content.

Hello爱情风 2024-09-15 08:44:03

Looks like Mark Adler had us in mind and wrote an example of just how to do this: http://www.zlib.net/zpipe.c

It compiles with nothing more than gcc -lz and the zlib headers installed. I copied the resulting binary to my /usr/local/bin/zpipe while working with git stuff.

回梦 2024-09-15 08:44:03
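// save this as deflate.go

package main

import (
    "compress/zlib"
    "flag"
    "io"
    "os"
)

var infile = flag.String("f", "", "infile")

func main() {
    flag.Parse()
    file, err := os.Open(*infile)
    if err != nil {
        panic(err)
    }
    defer file.Close()

    // zlib.NewReader expects a zlib stream (header + DEFLATE data + checksum)
    r, err := zlib.NewReader(file)
    if err != nil {
        panic(err)
    }
    defer r.Close()

    // stream the inflated data to stdout
    if _, err := io.Copy(os.Stdout, r); err != nil {
        panic(err)
    }
}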

$ go build deflate.go
$ ./deflate -f .git/objects/c0/fb67ab3fda7909000da003f4b2ce50a53f43e7
嘦怹 2024-09-15 08:44:03

pigz can do it:

apt-get install pigz
unpigz -c .git/objects/c0/fb67ab3fda7909000da003f4b2ce50a53f43e7
海螺姑娘 2024-09-15 08:44:03

git objects are zlib streams (not raw deflate). pigz will decompress those with the -dz option.
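
The difference between a zlib stream and raw deflate can be seen with Python's zlib module. This is a small sketch on made-up sample data, not on a real git object; it relies on the fact that a zlib stream without a preset dictionary is a 2-byte header, the raw DEFLATE data, and a 4-byte Adler-32 trailer.

```python
import zlib

payload = b"blob 5\x00hello"  # hypothetical sample data

zlib_stream = zlib.compress(payload)  # 2-byte header + DEFLATE data + 4-byte Adler-32
raw_deflate = zlib_stream[2:-4]       # strip the zlib wrapper by hand

# The default wbits expects the zlib wrapper, as found in git objects
assert zlib.decompress(zlib_stream) == payload

# A negative wbits tells zlib to expect raw DEFLATE with no wrapper
assert zlib.decompress(raw_deflate, wbits=-15) == payload

# Handing the zlib stream to a raw-DEFLATE decoder is expected to fail
try:
    zlib.decompress(zlib_stream, wbits=-15)
    accepted = True
except zlib.error:
    accepted = False
print("zlib stream accepted as raw DEFLATE:", accepted)
```

This is also why tools that only understand raw DEFLATE need the first two bytes skipped before they can read a git object.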

微暖i 2024-09-15 08:44:03

Python3 oneliner:

python3 -c "import zlib,sys; sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))" < infile > outfile

This way the contents are handled as binary data, avoiding conversion to/from unicode.

忘东忘西忘不掉你 2024-09-15 08:44:03

I have repeatedly come across this problem, and it seems almost all of the answers on the Internet are either wrong, require compiling some less-than-ideal code, or require downloading a whole slew of dependencies untracked by the system! But I found a real solution. It uses Perl, since Perl is readily available on most systems.

From a Bash-alike shell:

perl -mIO::Uncompress::RawInflate=rawinflate -erawinflate'"-","-"'

Or, if you're exec/fork-ing manually (without shell quotes, but line separated):

  • perl
  • -mIO::Uncompress::RawInflate=rawinflate
  • -erawinflate"-","-"

Big caveat: If the stream doesn't start off as a valid DEFLATE stream (say, uncompressed data), this command will happily pipe all the data through untouched. Only if the stream begins as a valid DEFLATE stream (with a valid dictionary, I suppose? I'm not too sure...) will this command error out somehow. In some situations this may be desirable, however.

References:

PERL IO::Uncompress::RawInflate::rawinflate

夜吻♂芭芘 2024-09-15 08:44:03

See http://en.wikipedia.org/wiki/DEFLATE#Encoder_implementations

It lists a number of software implementations, including gzip, so that should work. Did you try just running gzip on the file? Does it not recognize the format automatically?

How do you know it is compressed using DEFLATE? What tool was used to compress the file?

沉鱼一梦 2024-09-15 08:44:03

Why don't you just use git's tools to access the data? This should be able to read any git object:

git show --pretty=raw <object SHA-1>
梦境 2024-09-15 08:44:03

This is how I do it with PowerShell.

$fs = New-Object IO.FileStream((Resolve-Path $Path), [IO.FileMode]::Open, [IO.FileAccess]::Read)
$fs.Position = 2  # skip the 2-byte zlib header so DeflateStream sees raw DEFLATE data
$cs = New-Object IO.Compression.DeflateStream($fs, [IO.Compression.CompressionMode]::Decompress)
$sr = New-Object IO.StreamReader($cs)
$sr.ReadToEnd()

You can then create an alias like:

function func_deflate{
    param(
        [Parameter(Mandatory=$true, ValueFromPipeline = $true)]
        [ValidateScript({Test-Path $_ -PathType leaf})]
        [string]$Path
    )
    $ErrorActionPreference = 'Stop'    
    $fs = New-Object IO.FileStream((Resolve-Path $Path), [IO.FileMode]::Open, [IO.FileAccess]::Read)
    $fs.Position = 2
    $cs = New-Object IO.Compression.DeflateStream($fs, [IO.Compression.CompressionMode]::Decompress)
    $sr = New-Object IO.StreamReader($cs)
    return $sr.ReadToEnd()
}

Set-Alias -Name deflate -Value func_deflate

枕花眠 2024-09-15 08:44:03

I found this question while looking for a workaround for a bug in the -text utility in the new version of the hadoop dfs client I just installed. The -text utility works like cat, except that if the file being read is compressed, it transparently decompresses it and outputs the plain text (hence the name).

The answers already posted were definitely helpful, but some of them have one problem when dealing with Hadoop-sized amounts of data - they read everything into memory before decompressing.

So, here are my variations on the Perl and Python answers above that do not have that limitation:

Python:

hadoop fs -cat /path/to/example.deflate |
  python3 -c 'import zlib,sys;d=zlib.decompressobj();w=sys.stdout.buffer.write;[w(d.decompress(b)) for b in iter(lambda:sys.stdin.buffer.read(4096),b"")];w(d.flush())'

Perl:

hadoop fs -cat /path/to/example.deflate |
  perl -MCompress::Zlib -e '$x=inflateInit(); while(sysread(STDIN,$buf,4096)){($out,$st)=$x->inflate(\$buf); print $out}'

Note the use of the -cat sub-command, instead of -text. This is so that my work-around does not break after they've fixed the bug. Apologies for the readability of the python version.
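
For readability, the chunk-by-chunk idea can also be written out as a short script. This is only a sketch under assumptions: the function name stream_inflate and the in-memory test data are made up for illustration. It uses zlib.decompressobj, which keeps the decompression state between chunks, so the input never has to be held in memory all at once.

```python
import io
import zlib

def stream_inflate(src, dst, chunk_size=4096):
    """Copy a zlib stream from src to dst, inflating it chunk by chunk."""
    d = zlib.decompressobj()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(d.decompress(chunk))
    # Write out whatever the decompressor is still buffering
    dst.write(d.flush())

# Hypothetical sample data standing in for a large .deflate file
original = b"hadoop-sized data " * 10000
src = io.BytesIO(zlib.compress(original))
dst = io.BytesIO()
stream_inflate(src, dst, chunk_size=64)
print(dst.getvalue() == original)
```

In a pipeline, sys.stdin.buffer and sys.stdout.buffer can be passed as src and dst.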

牛↙奶布丁 2024-09-15 08:44:03

To add to the collection, here are perl one-liners for deflate/inflate/raw deflate/raw inflate.

Deflate

perl -MIO::Compress::Deflate -e 'undef $/; my ($in, $out) = (<>, undef); IO::Compress::Deflate::deflate(\$in, \$out); print $out;'

Inflate

perl -MIO::Uncompress::Inflate -e 'undef $/; my ($in, $out) = (<>, undef); IO::Uncompress::Inflate::inflate(\$in, \$out); print $out;'

Raw deflate

perl -MIO::Compress::RawDeflate -e 'undef $/; my ($in, $out) = (<>, undef); IO::Compress::RawDeflate::rawdeflate(\$in, \$out); print $out;'

Raw inflate

perl -MIO::Uncompress::RawInflate -e 'undef $/; my ($in, $out) = (<>, undef); IO::Uncompress::RawInflate::rawinflate(\$in, \$out); print $out;'
帥小哥 2024-09-15 08:44:03
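const zlib = require("zlib");
const adler32 = require("adler32"); // third-party npm package
const data = "hello world~!";
// Adler-32 of the uncompressed data, as 8 zero-padded hex digits
const chksum = adler32.sum(Buffer.from(data)).toString(16).padStart(8, "0");
// zlib stream assembled by hand: header, raw DEFLATE data, Adler-32 trailer
console.log("789c",zlib.deflateRawSync(data).toString("hex"),chksum);
// or let zlib add the header and trailer itself
console.log(zlib.deflateSync(data).toString("hex"));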
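
The same framing can be checked in Python using only the standard library. This is a sketch of the idea rather than a reference implementation: it wraps raw DEFLATE output in the 2-byte header 78 9c and a big-endian Adler-32 trailer, then confirms that a normal zlib decoder accepts the result. The compression level of 6 is an arbitrary choice for the example.

```python
import struct
import zlib

data = b"hello world~!"

# Raw DEFLATE: a negative wbits suppresses the zlib header and trailer
c = zlib.compressobj(level=6, wbits=-15)
raw = c.compress(data) + c.flush()

# zlib framing: 2-byte header, raw DEFLATE data, big-endian Adler-32 of the uncompressed data
framed = b"\x78\x9c" + raw + struct.pack(">I", zlib.adler32(data))

print(framed.hex())
print(zlib.decompress(framed) == data)
```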