在 Ruby 中创建数字、字符串、数组或哈希的 md5 哈希

发布于 2024-11-16 16:50:40 字数 570 浏览 1 评论 0 原文

我需要在 Ruby 中为变量创建签名字符串,其中变量可以是数字、字符串、哈希值或数组。哈希值和数组元素也可以是这些类型中的任何一种。

该字符串将用于比较数据库(在本例中为 Mongo)中的值。

我的第一个想法是创建 JSON 编码值的 MD5 哈希,如下所示:(body 是上面提到的变量)

def createsig(body)    
  Digest::MD5.hexdigest(JSON.generate(body))
end

这几乎可以工作,但是 JSON.generate 不会每次都以相同的顺序对哈希的键进行编码,所以 createsig({:a=>'a',:b=>'b'}) 并不总是等于createsig({:b=>'b',:a=>'a'})

创建签名字符串来满足此需求的最佳方法是什么?

注意:对于我们之间的细节,我知道你不能 JSON.generate() 数字或字符串。在这些情况下,我只需直接调用 MD5.hexdigest() 即可。

I need to create a signature string for a variable in Ruby, where the variable can be a number, a string, a hash, or an array. The hash values and array elements can also be any of these types.

This string will be used to compare the values in a database (Mongo, in this case).

My first thought was to create an MD5 hash of a JSON encoded value, like so: (body is the variable referred to above)

def createsig(body)    
  Digest::MD5.hexdigest(JSON.generate(body))
end

This nearly works, but JSON.generate does not encode the keys of a hash in the same order each time, so createsig({:a=>'a',:b=>'b'}) does not always equal createsig({:b=>'b',:a=>'a'}).

What is the best way to create a signature string to fit this need?

Note: For the detail oriented among us, I know that you can't JSON.generate() a number or a string. In these cases, I would just call MD5.hexdigest() directly.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

回眸一笑 2024-11-23 16:50:40

我很快就编写了以下代码,没有时间在工作中真正测试它,但它应该可以完成工作。如果您发现任何问题,请告诉我,我会查看。

这应该正确地展平并对数组和散列进行排序,并且您需要一些看起来非常奇怪的字符串才能避免任何冲突。

def createsig(body)
  Digest::MD5.hexdigest( sigflat body )
end

def sigflat(body)
  if body.class == Hash
    arr = []
    body.each do |key, value|
      arr << "#{sigflat key}=>#{sigflat value}"
    end
    body = arr
  end
  if body.class == Array
    str = ''
    body.map! do |value|
      sigflat value
    end.sort!.each do |value|
      str << value
    end
  end
  if body.class != String
    body = body.to_s << body.class.to_s
  end
  body
end

> sigflat({:a => {:b => 'b', :c => 'c'}, :d => 'd'}) == sigflat({:d => 'd', :a => {:c => 'c', :b => 'b'}})
=> true

I coding up the following pretty quickly and don't have time to really test it here at work, but it ought to do the job. Let me know if you find any issues with it and I'll take a look.

This should properly flatten out and sort the arrays and hashes, and you'd need to have to some pretty strange looking strings for there to be any collisions.

def createsig(body)
  Digest::MD5.hexdigest( sigflat body )
end

def sigflat(body)
  if body.class == Hash
    arr = []
    body.each do |key, value|
      arr << "#{sigflat key}=>#{sigflat value}"
    end
    body = arr
  end
  if body.class == Array
    str = ''
    body.map! do |value|
      sigflat value
    end.sort!.each do |value|
      str << value
    end
  end
  if body.class != String
    body = body.to_s << body.class.to_s
  end
  body
end

> sigflat({:a => {:b => 'b', :c => 'c'}, :d => 'd'}) == sigflat({:d => 'd', :a => {:c => 'c', :b => 'b'}})
=> true
场罚期间 2024-11-23 16:50:40

如果您只能获取 body 的字符串表示形式,并且每次都不能以不同的顺序返回 Ruby 1.8 哈希值,那么您就可以可靠地对该字符串表示形式进行哈希处理。让我们开始使用一些猴子补丁:

require 'digest/md5'

class Object
  def md5key
    to_s
  end
end

class Array
  def md5key
    map(&:md5key).join
  end
end

class Hash
  def md5key
    sort.map(&:md5key).join
  end
end

现在任何对象(问题中提到的类型)都会通过返回用于创建校验和的可靠密钥来响应 md5key,因此:

def createsig(o)
  Digest::MD5.hexdigest(o.md5key)
end

示例:

body = [
  {
    'bar' => [
      345,
      "baz",
    ],
    'qux' => 7,
  },
  "foo",
  123,
]
p body.md5key        # => "bar345bazqux7foo123"
p createsig(body)    # => "3a92036374de88118faf19483fe2572e"

注意:该哈希表示不编码结构,仅编码值的串联。因此 ["a", "b", "c"] 的哈希值与 ["abc"] 相同。

If you could only get a string representation of body and not have the Ruby 1.8 hash come back with different orders from one time to the other, you could reliably hash that string representation. Let's get our hands dirty with some monkey patches:

require 'digest/md5'

class Object
  def md5key
    to_s
  end
end

class Array
  def md5key
    map(&:md5key).join
  end
end

class Hash
  def md5key
    sort.map(&:md5key).join
  end
end

Now any object (of the types mentioned in the question) respond to md5key by returning a reliable key to use for creating a checksum, so:

def createsig(o)
  Digest::MD5.hexdigest(o.md5key)
end

Example:

body = [
  {
    'bar' => [
      345,
      "baz",
    ],
    'qux' => 7,
  },
  "foo",
  123,
]
p body.md5key        # => "bar345bazqux7foo123"
p createsig(body)    # => "3a92036374de88118faf19483fe2572e"

Note: This hash representation does not encode the structure, only the concatenation of the values. Therefore ["a", "b", "c"] will hash the same as ["abc"].

慢慢从新开始 2024-11-23 16:50:40

这是我的解决方案。我遍历数据结构并构建一个连接成单个字符串的片段列表。为了确保所看到的类类型影响哈希值,我注入了一个 unicode 字符,该字符一路上对基本类型信息进行编码。 (例如,我们想要 ["1", "2", "3"].objsum != [1,2,3].objsum)

我这样做是为了对 Object 进行改进,它很容易移植到猴子补丁中。要使用它,只需需要该文件并运行“using ObjSum”。

module ObjSum
  refine Object do
    def objsum
      parts = []
      queue = [self]

      while queue.size > 0
        item = queue.shift

        if item.kind_of?(Hash)
          parts << "\\000"
          item.keys.sort.each do |k| 
            queue << k
            queue << item[k]
          end
        elsif item.kind_of?(Set)
          parts << "\\001"
          item.to_a.sort.each { |i| queue << i }
        elsif item.kind_of?(Enumerable)
          parts << "\\002"
          item.each { |i| queue << i }
        elsif item.kind_of?(Fixnum)
          parts << "\\003"
          parts << item.to_s
        elsif item.kind_of?(Float)
          parts << "\\004"
          parts << item.to_s
        else
          parts << item.to_s
        end
      end

      Digest::MD5.hexdigest(parts.join)
    end
  end
end

Here's my solution. I walk the data structure and build up a list of pieces that get joined into a single string. In order to ensure that the class types seen affect the hash, I inject a single unicode character that encodes basic type information along the way. (For example, we want ["1", "2", "3"].objsum != [1,2,3].objsum)

I did this as a refinement on Object, it's easily ported to a monkey patch. To use it just require the file and run "using ObjSum".

module ObjSum
  refine Object do
    def objsum
      parts = []
      queue = [self]

      while queue.size > 0
        item = queue.shift

        if item.kind_of?(Hash)
          parts << "\\000"
          item.keys.sort.each do |k| 
            queue << k
            queue << item[k]
          end
        elsif item.kind_of?(Set)
          parts << "\\001"
          item.to_a.sort.each { |i| queue << i }
        elsif item.kind_of?(Enumerable)
          parts << "\\002"
          item.each { |i| queue << i }
        elsif item.kind_of?(Fixnum)
          parts << "\\003"
          parts << item.to_s
        elsif item.kind_of?(Float)
          parts << "\\004"
          parts << item.to_s
        else
          parts << item.to_s
        end
      end

      Digest::MD5.hexdigest(parts.join)
    end
  end
end
转瞬即逝 2024-11-23 16:50:40

只是我的 2 美分:

module Ext
  module Hash
    module InstanceMethods
      # Return a string suitable for generating content signature.
      # Signature image does not depend on order of keys.
      #
      #   {:a => 1, :b => 2}.signature_image == {:b => 2, :a => 1}.signature_image                  # => true
      #   {{:a => 1, :b => 2} => 3}.signature_image == {{:b => 2, :a => 1} => 3}.signature_image    # => true
      #   etc.
      #
      # NOTE: Signature images of identical content generated under different versions of Ruby are NOT GUARANTEED to be identical.
      def signature_image
        # Store normalized key-value pairs here.
        ar = []

        each do |k, v|
          ar << [
            k.is_a?(::Hash) ? k.signature_image : [k.class.to_s, k.inspect].join(":"),
            v.is_a?(::Hash) ? v.signature_image : [v.class.to_s, v.inspect].join(":"),
          ]
        end

        ar.sort.inspect
      end
    end
  end
end

class Hash    #:nodoc:
  include Ext::Hash::InstanceMethods
end

Just my 2 cents:

module Ext
  module Hash
    module InstanceMethods
      # Return a string suitable for generating content signature.
      # Signature image does not depend on order of keys.
      #
      #   {:a => 1, :b => 2}.signature_image == {:b => 2, :a => 1}.signature_image                  # => true
      #   {{:a => 1, :b => 2} => 3}.signature_image == {{:b => 2, :a => 1} => 3}.signature_image    # => true
      #   etc.
      #
      # NOTE: Signature images of identical content generated under different versions of Ruby are NOT GUARANTEED to be identical.
      def signature_image
        # Store normalized key-value pairs here.
        ar = []

        each do |k, v|
          ar << [
            k.is_a?(::Hash) ? k.signature_image : [k.class.to_s, k.inspect].join(":"),
            v.is_a?(::Hash) ? v.signature_image : [v.class.to_s, v.inspect].join(":"),
          ]
        end

        ar.sort.inspect
      end
    end
  end
end

class Hash    #:nodoc:
  include Ext::Hash::InstanceMethods
end
怎言笑 2024-11-23 16:50:40

如今,有一个正式定义的 JSON 规范化方法,正是出于这个原因: https://datatracker.ietf.org/doc/html/draft-rundgren-json-canonicalization-scheme-16

有一个ruby 实现在这里: https://github.com/dryruby/json-canonicalization

These days there is a formally defined method for canonicalizing JSON, for exactly this reason: https://datatracker.ietf.org/doc/html/draft-rundgren-json-canonicalization-scheme-16

There is a ruby implementation here: https://github.com/dryruby/json-canonicalization

長街聽風 2024-11-23 16:50:40

根据您的需要,您甚至可以调用 ary.inspect 或 ary.to_yaml 。

Depending on your needs, you could call ary.inspect or ary.to_yaml, even.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文