Digest::CRC32 with Zlib

Posted on 2024-12-22 12:08:39


In my code, I need to hash files using a variety of algorithms, including CRC32. Since I'm also using cryptographic hash functions from the Digest family, I thought it would be nice to maintain a consistent interface for them all.

For the record, I did find digest-crc, a gem which does exactly what I want. The thing is, Zlib is part of the standard library and has a working implementation of CRC32 that I'd like to reuse. Also, it is written in C, so it should offer better performance than digest-crc, which is a pure-Ruby implementation.

Implementing Digest::CRC32 actually looked pretty straightforward at first:

%w(digest zlib).each { |f| require f }

class Digest::CRC32 < Digest::Class
  include Digest::Instance

  def update(str)
    @crc32 = Zlib.crc32(str, @crc32)
  end

  def initialize; reset; end
  def reset; @crc32 = 0; end
  def finish; @crc32.to_s; end
end

Everything looks right:

crc32 = File.open('Rakefile') { |f| Zlib.crc32 f.read }
digest = Digest::CRC32.file('Rakefile').digest!.to_i
crc32 == digest
=> true

Unfortunately, not everything works:

Digest::CRC32.file('Rakefile').hexdigest!
=> "313635393830353832"

# What I actually expected was:
Digest::CRC32.file('Rakefile').digest!.to_i.to_s(16)
=> "9e4a9a6"

hexdigest basically returns Digest.hexencode(digest), which works with the value of the digest at the byte level. I'm not sure how that function works, so I was wondering if it is possible to achieve this with just the integer returned from Zlib.crc32.
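
As a side note on what Digest.hexencode does: it writes each byte of the string it is given as two hex digits, so the result depends on which bytes finish returns, not on the number those bytes happen to spell out. A minimal sketch, using the integer that the hexdigest output above decodes to (the 'N' big-endian pack format is only one possible choice, assumed here for illustration):

```ruby
require 'digest'

# Integer recovered by decoding the hexdigest output shown above
crc = 165980582

# Decimal string: hexencode emits the ASCII codes of "1", "6", "5", ...
as_text = Digest.hexencode(crc.to_s)

# Four raw bytes (big-endian is an assumption): hexencode emits the usual hex form
as_bytes = Digest.hexencode([crc].pack('N'))

puts as_text   # 313635393830353832
puts as_bytes  # 09e4a9a6
```

Note that the packed form is zero-padded to eight hex digits, whereas to_s(16) drops the leading zero.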

别再吹冷风 2024-12-29 12:08:39

Digest expects finish (and therefore digest) to return the raw bytes that make up the checksum, i.e. in the case of a CRC32, the 4 bytes that make up that 32-bit integer. You are instead returning a string that contains the base-10 representation of that integer.

You want something like

[@crc32].pack('V')

to turn that integer into the bytes that represent it. Do read up on pack and its various format specifiers: there are lots of ways of packing an integer, depending on whether the bytes should be presented in native, big-endian, or little-endian order, so you should figure out which one matches your needs. ('V' is 32-bit little-endian; 'N' is 32-bit big-endian.)
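
Putting that together, here is a minimal sketch of the class from the question with finish returning packed bytes. It assumes big-endian order ('N') so that hexdigest reads like the conventional zero-padded hex form of the CRC; the `<<` alias and digest_length are additions for convenience and are not part of the original code.

```ruby
require 'digest'
require 'zlib'

# Sketch of a Zlib-backed CRC32 digest. The 'N' (big-endian) format is an
# assumption; use 'V' instead if little-endian bytes are wanted.
class Digest::CRC32 < Digest::Class
  def initialize
    reset
  end

  # Start over from the initial CRC value
  def reset
    @crc32 = 0
    self
  end

  # Fold more data into the running CRC; returns self so calls can be chained
  def update(str)
    @crc32 = Zlib.crc32(str, @crc32)
    self
  end
  alias << update

  # Raw 4-byte checksum, which is what digest and hexdigest build on
  def finish
    [@crc32].pack('N')
  end

  def digest_length
    4
  end
end

crc = Digest::CRC32.new
crc << 'hello ' << 'world'
puts crc.hexdigest
puts format('%08x', Zlib.crc32('hello world'))
```

The two printed lines should match, since hexdigest now hex-encodes the four checksum bytes rather than the decimal string.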


可爱暴击 2024-12-29 12:08:39

Sorry, this doesn't really answer your question, but it might help.

Firstly, when reading in a file, make sure you pass the "rb" mode. I can see you're not on Windows, but if your code ever ends up running on a Windows machine it won't behave the same way, because text mode there translates line endings while Digest's file method reads in binary mode. This matters especially when reading Ruby files. Example:

crc32 = File.open('test.rb') { |f| Zlib.crc32 f.read }
#=> 189072290
digest = Digest::CRC32.file('test.rb').digest!.to_i
#=> 314435800
crc32 == digest
#=> false

crc32 = File.open('test.rb', "rb") { |f| Zlib.crc32 f.read }
#=> 314435800
digest = Digest::CRC32.file('test.rb').digest!.to_i
#=> 314435800
crc32 == digest
#=> true

The above will work across all platforms and all Rubies that I know of.
But that's not what you asked.

I'm pretty sure the hexdigest and digest methods in your example above are working as they should, though:

dig_file = Digest::CRC32.file('test.rb')

test1 = dig_file.hexdigest
#=> "333134343335383030"

test2 = dig_file.digest
#=> "314435800"

def hexdigest_to_digest(h)
  h.unpack('a2'*(h.size/2)).collect {|i| i.hex.chr }.join
end

test3 = hexdigest_to_digest(test1)
#=> "314435800"

So I'm guessing that either the .to_i.to_s(16) is throwing off your expected result, or your expected result may be wrong. Not sure, but all the best.
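
For reference, the round trip above can probably be written more compactly with pack('H*'), which decodes a hex string back into the bytes it encodes. This is only a sketch; hex_to_bytes is a hypothetical helper name, not something from the Digest library.

```ruby
require 'digest'

# Hypothetical helper: turn a hex string back into the bytes it encodes
def hex_to_bytes(hex)
  [hex].pack('H*')
end

decoded = hex_to_bytes('333134343335383030')
puts decoded                    # 314435800
puts Digest.hexencode(decoded)  # 333134343335383030
```

It shows the same thing as the helper above: hexdigest is just the hex encoding of whatever bytes digest returns, here the ASCII characters of the decimal string.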
