What's the most efficient way to deep copy an object in Ruby?

Posted on 2024-11-01 16:00:21 | 382 characters | 6 views | 0 comments

I know that serializing an object is (to my knowledge) the only way to effectively deep-copy an object (as long as it isn't stateful like IO and whatnot), but is one way particularly more efficient than another?

For example, since I'm using Rails, I could always use ActiveSupport::JSON, to_xml - and from what I can tell marshalling the object is one of the most accepted ways to do this. I'd expect that marshalling is probably the most efficient of these since it's a Ruby internal, but am I missing anything?

Edit: note that the implementation is something I already have covered. I don't want to replace the existing shallow copy methods (like dup and clone), so I'll most likely end up adding Object::deep_copy, which would use whichever of the above methods (or any you suggest :) has the least overhead.
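
For reference, the Marshal approach mentioned in the question can be sketched roughly as follows. The module name DeepCopy is only an assumption for illustration (the question talks about adding the method to Object instead), and this is a minimal sketch rather than a recommended implementation. Marshal raises TypeError for objects it cannot dump, such as Procs and IO objects.

```ruby
# Hypothetical helper; the name DeepCopy is an assumption for illustration.
module DeepCopy
  # Round-trips the object through Marshal to get an independent copy.
  # Raises TypeError for objects Marshal cannot dump (IO, Proc, ...).
  def self.deep_copy(obj)
    Marshal.load(Marshal.dump(obj))
  end
end

original = { 'a' => [1, 'two', { three: 'x' }] }
copy = DeepCopy.deep_copy(original)

copy['a'][1] << '!'      # mutate a nested string in the copy only
puts original['a'][1]    # prints "two": the original is untouched
puts copy['a'][1]        # prints "two!"
```

Keeping the helper in its own module avoids touching dup and clone, which matches the constraint in the edit above.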

Comments (3)

草莓酥 2024-11-08 16:00:21

I was wondering the same thing, so I benchmarked a few different techniques against each other. I was primarily concerned with Arrays and Hashes; I didn't test any complex objects. Perhaps unsurprisingly, a custom deep-clone implementation proved to be the fastest. If you are looking for a quick and easy implementation, Marshal appears to be the way to go.

I also benchmarked an XML solution with Rails 3.0.7, not shown below. It was much, much slower, ~10 seconds for only 1000 iterations (the solutions below all ran 10,000 times for the benchmark).

Two notes regarding my JSON solution. First, I used the C variant, version 1.4.3. Second, it doesn't actually work 100%, as symbols will be converted to Strings.
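
The symbol issue can be seen with a small example. This is only a sketch using the standard json library (not the exact 1.4.3 C extension used in the benchmark), and the sample value is made up for illustration.

```ruby
require 'json'

value = { 'a' => { :x => [1, nil] } }

via_json    = JSON.parse(JSON.generate(value))
via_marshal = Marshal.load(Marshal.dump(value))

p via_json                  # the symbol key :x comes back as the string "x"
p via_marshal               # the symbol key :x is preserved
puts via_json == value      # prints "false"
puts via_marshal == value   # prints "true"
```

Passing symbolize_names: true to JSON.parse would not fix this either, since it turns every key into a symbol, including the ones that started out as strings.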

This was all run with ruby 1.9.2p180.

#!/usr/bin/env ruby
require 'benchmark'
require 'yaml'
require 'json/ext'
require 'msgpack'

# Marshal round trip
def dc1(value)
  Marshal.load(Marshal.dump(value))
end

# YAML round trip
def dc2(value)
  YAML.load(YAML.dump(value))
end

# JSON round trip (symbols come back as strings)
def dc3(value)
  JSON.load(JSON.dump(value))
end

# Custom recursive copy: rebuilds Hashes and Arrays, returns other values as-is
def dc4(value)
  if value.is_a?(Hash)
    result = value.clone
    value.each{|k, v| result[k] = dc4(v)}
    result
  elsif value.is_a?(Array)
    result = value.clone
    result.clear
    value.each{|v| result << dc4(v)}
    result
  else
    value
  end
end

# MessagePack round trip
def dc5(value)
  MessagePack.unpack(value.to_msgpack)
end

value = {'a' => {:x => [1, [nil, 'b'], {'a' => 1}]}, 'b' => ['z']}

Benchmark.bm do |x|
  iterations = 10000
  x.report {iterations.times {dc1(value)}}
  x.report {iterations.times {dc2(value)}}
  x.report {iterations.times {dc3(value)}}
  x.report {iterations.times {dc4(value)}}
  x.report {iterations.times {dc5(value)}}
end

results in:

                 user     system      total         real
Marshal      0.230000   0.000000   0.230000 (  0.239257)
YAML         3.240000   0.030000   3.270000 (  3.262255)
JSON         0.590000   0.010000   0.600000 (  0.601693)
Custom       0.060000   0.000000   0.060000 (  0.067661)
MessagePack  0.090000   0.010000   0.100000 (  0.097705)
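
One thing to keep in mind when reading the Custom number: dc4 as written rebuilds the Hash and Array containers but returns every other value unchanged, so leaf objects such as Strings are shared between the original and the copy. The snippet below repeats dc4 so it runs on its own; the sample value is an assumption for illustration.

```ruby
# dc4 as defined in the benchmark, repeated so this snippet is self-contained
def dc4(value)
  if value.is_a?(Hash)
    result = value.clone
    value.each { |k, v| result[k] = dc4(v) }
    result
  elsif value.is_a?(Array)
    result = value.clone
    result.clear
    value.each { |v| result << dc4(v) }
    result
  else
    value # non-container values are returned as-is, not copied
  end
end

value = { 'b' => [String.new('z')] }
copy = dc4(value)

puts copy['b'].equal?(value['b'])         # prints "false": the Array was rebuilt
puts copy['b'][0].equal?(value['b'][0])   # prints "true": the String is shared
```

If the leaves need to be independent too, the else branch would have to duplicate them, which would add some cost to the Custom timing.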

稀香 2024-11-08 16:00:21

I think you need to add an initialize_copy method to the class you are copying. Then put the logic for the deep copy in there. Then when you call clone it will fire that method. I haven't done it but that's my understanding.
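
A minimal sketch of that initialize_copy idea, assuming a class with a single String attribute (the class and attribute names here are just for illustration). Ruby calls initialize_copy on the new object after dup or clone has made the shallow copy, passing the source object as the argument.

```ruby
class CopyMe
  attr_accessor :var

  def initialize(var = '')
    @var = var
  end

  # Called on the new object by dup/clone after the shallow copy;
  # `source` is the object being copied.
  def initialize_copy(source)
    super
    @var = source.var.dup   # give the copy its own String
  end
end

a = CopyMe.new(String.new('test'))
b = a.clone
b.var << '!'

puts a.var   # prints "test": a's string is not shared with b
puts b.var   # prints "test!"
```

Note that this only goes one level deep: if @var were itself a nested structure, dup would copy just its outer container.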

I think plan B would be just overriding the clone method:

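class CopyMe
  attr_accessor :var

  def initialize(var = '')
    @var = var
  end

  # With deep = true, the new object gets its own copy of @var;
  # otherwise it is a fresh CopyMe with the default (empty) var.
  def clone(deep = false)
    deep ? CopyMe.new(@var.clone) : CopyMe.new
  end
end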

a = CopyMe.new("test")  
puts "A: #{a.var}"
b = a.clone
puts "B: #{b.var}"
c = a.clone(true)
puts "C: #{c.var}"

Output

mike@sleepycat:~/projects$ ruby ~/Desktop/clone.rb 
A: test
B: 
C: test

I'm sure you could make that cooler with a little tinkering but for better or for worse that is probably how I would do it.

腻橙味 2024-11-08 16:00:21

Probably the reason Ruby doesn't contain a deep clone has to do with the complexity of the problem. See the notes at the end.

To make a clone that deep copies Hashes, Arrays, and elemental values, i.e., one that copies each element of the original so that the copy has the same values but new objects, you can use this:

class Object
  def deepclone
    case
    when self.class==Hash
      hash = {}
      self.each { |k,v| hash[k] = v.deepclone }
      hash
    when self.class==Array
      array = []
      self.each { |v| array << v.deepclone }
      array
    else
      if defined?(self.class.new)
        self.class.new(self)
      else
        self
      end
    end
  end
end

If you want to redefine the behavior of Ruby's clone method, you can name it clone instead of deepclone (in 3 places), but I have no idea how redefining Ruby's clone behavior will affect Ruby libraries or Ruby on Rails, so caveat emptor. Personally, I can't recommend doing that.

For example:

a = {'a'=>'x','b'=>'y'}                          => {"a"=>"x", "b"=>"y"}
b = a.deepclone                                  => {"a"=>"x", "b"=>"y"}
puts "#{a['a'].object_id} / #{b['a'].object_id}" => 15227640 / 15209520

If you want your classes to deepclone properly, their new method (initialize) must be able to deepclone an object of that class in the standard way, i.e., if the first parameter is given, it's assumed to be an object to be deepcloned.

Suppose we want a class M, for example. The first parameter must be an optional object of class M. Here we have a second optional argument z to pre-set the value of z in the new object.

class M
  attr_accessor :z
  def initialize(m=nil, z=nil)
    if m
      # deepclone all the variables in m to the new object
      @z = m.z.deepclone
    else
      # default all the variables in M
      @z = z # default is nil if not specified
    end
  end
end

The z pre-set is ignored during cloning here, but your method may have a different behavior. Objects of this class would be created like this:

# a new 'plain vanilla' object of M
m=M.new                                        => #<M:0x0000000213fd88 @z=nil>
# a new object of M with m.z pre-set to 'g'
m=M.new(nil,'g')                               => #<M:0x00000002134ca8 @z="g">
# a deepclone of m in which the strings are the same value, but different objects
n=m.deepclone                                  => #<M:0x00000002131d00 @z="g">
puts "#{m.z.object_id} / #{n.z.object_id}" => 17409660 / 17403500

Where objects of class M are part of a Hash:

a = {'a'=>M.new(nil,'g'),'b'=>'y'}               => {"a"=>#<M:0x00000001f8bf78 @z="g">, "b"=>"y"}
b = a.deepclone                                  => {"a"=>#<M:0x00000001766f28 @z="g">, "b"=>"y"}
puts "#{a['a'].object_id} / #{b['a'].object_id}" => 12303600 / 12269460
puts "#{a['b'].object_id} / #{b['b'].object_id}" => 16811400 / 17802280

Notes:

  • If deepclone tries to clone an object which doesn't clone itself in the standard way, it may fail.
  • If deepclone tries to clone an object which can clone itself in the standard way, and if it is a complex structure, it may (and probably will) make a shallow clone of itself.
  • deepclone doesn't deep copy the keys in the Hashes. The reason is that they are not usually treated as data, but if you change hash[k] to hash[k.deepclone] they will be deep copied as well.
  • Certain elemental values have no new method, such as Fixnum (merged into Integer as of Ruby 2.4). These objects always have the same object ID, and are copied, not cloned.
  • Be careful because when you deep copy, two parts of your Hash or Array that contained the same object in the original will contain different objects in the deepclone.
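
The last note can be addressed by remembering which objects have already been copied. The following is only a sketch and not part of the deepclone method above: the helper name deep_copy_shared is hypothetical, it only handles Hashes, Arrays, and Strings, and it returns everything else as-is. Marshal is shown alongside it because it keeps shared references within a single dump.

```ruby
# Hypothetical helper (not from the answer above): a recursive copy that
# remembers already-copied objects, keyed by object identity.
def deep_copy_shared(obj, seen = {}.compare_by_identity)
  return seen[obj] if seen.key?(obj)

  case obj
  when Hash
    copy = seen[obj] = {}
    obj.each { |k, v| copy[k] = deep_copy_shared(v, seen) }
    copy
  when Array
    copy = seen[obj] = []
    obj.each { |v| copy << deep_copy_shared(v, seen) }
    copy
  when String
    seen[obj] = obj.dup
  else
    obj # everything else (Integers, Symbols, nil, ...) is returned as-is
  end
end

shared = String.new('s')
original = { 'x' => shared, 'y' => [shared] }

copy = deep_copy_shared(original)
puts copy['x'].equal?(copy['y'][0])    # prints "true": sharing is preserved
puts copy['x'].equal?(original['x'])   # prints "false": it is a new object

via_marshal = Marshal.load(Marshal.dump(original))
puts via_marshal['x'].equal?(via_marshal['y'][0])   # prints "true"
```

Registering each container in seen before recursing into it also lets the helper cope with structures that contain themselves.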