Ruby - Compare two Enumerators elegantly

Posted on 2024-11-17 09:23:38 · 807 words · 2 views · 0 comments

I've got two long streams of numbers coming from two different sources (binary data) in Ruby (1.9.2).

The two sources are encapsulated in the form of two Enumerators.

I want to check that the two streams are exactly equal.

I've come up with a couple of solutions, but both seem quite inelegant.

The first one simply transforms both into an array:

def equal_streams?(s1, s2)
  s1.to_a == s2.to_a
end

This works, but it is not very efficient memory-wise, especially if the streams carry a lot of data.

The other option is... ugh.

def equal_streams?(s1, s2)
  s1.each do |e1|
    begin
      e2 = s2.next
      return false unless e1 == e2 # Different element found
    rescue StopIteration
      return false # s2 has run out of items before s1
    end
  end

  begin
    s2.next
  rescue StopIteration
    # s1 and s2 have run out of elements at the same time; they are equal
    return true
  end

  return false

end

So, is there a simpler, more elegant way of doing this?

Comments (5)

空宴 2024-11-24 09:23:39

Just a slight refactoring to your code, assuming that your streams do not include an element :eof.

def equal_streams?(s1, s2)
  loop do
    e1 = s1.next rescue :eof
    e2 = s2.next rescue :eof
    return false unless e1 == e2
    return true if e1 == :eof
  end
end

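Note that loop is a method (Kernel#loop), not a keyword, so it is not inherently faster than a method like each; measure before relying on any speed difference. Also be aware that the rescue modifier used above catches any StandardError, not just StopIteration.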
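
If the streams might legitimately contain :eof, the same idea can be sketched with a sentinel that no stream can contain. This is only a minimal sketch, not the answer's code: the local eof object and the next_or_eof lambda are names made up for illustration, and it assumes the elements do not override == in unusual ways. It also rescues only StopIteration, so other errors raised by the streams still surface.

```ruby
def equal_streams?(s1, s2)
  # Hypothetical sentinel: a fresh object created per call, so it cannot
  # already be an element of either stream.
  eof = Object.new

  # Returns the next element, or the sentinel once the stream is exhausted.
  next_or_eof = lambda do |enum|
    begin
      enum.next
    rescue StopIteration
      eof
    end
  end

  loop do
    e1 = next_or_eof.call(s1)
    e2 = next_or_eof.call(s2)
    # If either stream has ended, they are equal only if both ended together.
    if e1.equal?(eof) || e2.equal?(eof)
      return e1.equal?(eof) && e2.equal?(eof)
    end
    return false unless e1 == e2
  end
end

p equal_streams?([1, 2, 3].each, [1, 2, 3].each) #=> true
p equal_streams?([1, 2].each, [1, 2, 3].each)    #=> false
p equal_streams?([:eof].each, [].each)           #=> false
```

Identity checks (equal?) are used for the sentinel so that an element with a permissive == cannot be mistaken for the end of the stream.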

温柔一刀 2024-11-24 09:23:39

Comparing them one element at a time is probably the best you're going to be able to do, but you can do it more cleanly than your "ugh" solution:

def grab_next(h, k, s)
  h[k] = s.next
rescue StopIteration
end

def equal_streams?(s1, s2)
  loop do
    vals = { }
    grab_next(vals, :s1, s1)
    grab_next(vals, :s2, s2)
    return true  if(vals.keys.length == 0)  # Both of them ran out.
    return false if(vals.keys.length == 1)  # One of them ran out early.
    return false if(vals[:s1] != vals[:s2]) # Found a mismatch.
  end
end

The tricky part is differentiating between just one stream running out and both running out. Pushing the StopIteration exception into a separate function and using the absence of a key in a hash is a fairly convenient way of doing that. Just checking vals[:s1] will cause problems if your stream contains false or nil but checking for the presence of a key solves that problem.
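
To make the nil/false point concrete, here is a small sketch that puts the two functions from this answer next to a deliberately naive variant. naive_equal_streams? is a hypothetical name invented for this illustration; it treats a nil value as "stream ended", which is exactly the mistake the key-presence check avoids.

```ruby
# The two functions below are taken from the answer above.
def grab_next(h, k, s)
  h[k] = s.next
rescue StopIteration
end

def equal_streams?(s1, s2)
  loop do
    vals = { }
    grab_next(vals, :s1, s1)
    grab_next(vals, :s2, s2)
    return true  if(vals.keys.length == 0)  # Both of them ran out.
    return false if(vals.keys.length == 1)  # One of them ran out early.
    return false if(vals[:s1] != vals[:s2]) # Found a mismatch.
  end
end

# Hypothetical naive variant (for illustration only): uses nil to mean
# "exhausted", so a genuine nil element is mistaken for the end of the stream.
def naive_equal_streams?(s1, s2)
  loop do
    v1 = begin
      s1.next
    rescue StopIteration
      nil
    end
    v2 = begin
      s2.next
    rescue StopIteration
      nil
    end
    return true if v1.nil? && v2.nil?
    return false if v1 != v2
  end
end

p equal_streams?([1, nil, 2].each, [1, nil, 3].each)       #=> false
p naive_equal_streams?([1, nil, 2].each, [1, nil, 3].each) #=> true (wrong)
```

The naive version stops at the nil in second position and never sees that the third elements differ.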

咆哮 2024-11-24 09:23:39

Here's a shot at doing it by creating an alternative to Enumerable#zip that works lazily and doesn't build an entire array. It combines my implementation of Clojure's interleave with the other two answers here (using a sentinel value to indicate that the end of the Enumerable has been reached; the underlying problem is that next rewinds the Enumerable once it has reached the end).

This solution supports multiple parameters, so you can compare n structures at once.

module Enumerable
  # this should be just a unique sentinel value (any ideas for more elegant solution?)
  END_REACHED = Object.new

  def lazy_zip *others
    sources = ([self] + others).map(&:to_enum)
    Enumerator.new do |yielder|
      loop do
        sources, values = sources.map{|s|
          [s, s.next] rescue [nil, END_REACHED]
        }.transpose
        raise StopIteration if values.all?{|v| v == END_REACHED}
        yielder.yield values.map{|v| v == END_REACHED ? nil : v}
      end
    end
  end
end

So, once you have a variant of zip that works lazily and doesn't stop iterating when the first enumerable reaches its end, you can use all? or any? to check corresponding elements for equality.

# zip would fail here, as it would return just [[1,1],[2,2],[3,3]]:
p [1,2,3].lazy_zip([1,2,3,4]).all?{|l,r| l == r}
#=> false

# this is ok
p [1,2,3,4].lazy_zip([1,2,3,4]).all?{|l,r| l == r}
#=> true

# comparing more than two input streams:
p [1,2,3,4].lazy_zip([1,2,3,4],[1,2,3]).all?{|vals|
  # check for equality by checking length of the uniqued array
  vals.uniq.length == 1
}
#=> false
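
One caveat with the lazy_zip above: exhausted sources are padded with nil, so a stream whose extra trailing elements are nil compares equal to a shorter stream (for example [1,2,3] against [1,2,3,nil]). A possible way around that, sketched below, is to pad with a caller-supplied object instead. lazy_zip_padded and its pad parameter are hypothetical names for this illustration and are not part of the answer's code; the sketch is a standalone function rather than an Enumerable extension.

```ruby
# Hypothetical helper: zips lazily, padding exhausted sources with `pad`
# (any object the caller knows is not in the streams) instead of nil.
def lazy_zip_padded(pad, *enums)
  Enumerator.new do |yielder|
    live = enums.map(&:to_enum)
    loop do
      values = live.each_index.map do |i|
        next pad if live[i].nil?   # this source already ran out
        begin
          live[i].next
        rescue StopIteration
          live[i] = nil            # remember that this source is exhausted
          pad
        end
      end
      break if live.all?(&:nil?)   # every source has ended
      yielder << values
    end
  end
end

pad = Object.new
p lazy_zip_padded(pad, [1, 2, 3], [1, 2, 3, nil]).all? { |l, r| l == r }
#=> false
p lazy_zip_padded(pad, [1, 2, 3], [1, 2, 3]).all? { |l, r| l == r }
#=> true
```

Since pad is a fresh Object, it is not == to nil or to any ordinary element, so a length mismatch always shows up as an unequal pair.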

凡尘雨 2024-11-24 09:23:39

Following the discussion in the comments, here is a zip-based solution that first wraps the block version of zip in an Enumerator and then uses it to compare corresponding elements.

It works, but with the edge case already mentioned: if the first stream is shorter than the other, the remaining elements of the other stream are discarded (see the example below).

I have marked this answer as community wiki so that other members can improve on it.

def zip_lazy *enums
  Enumerator.new do |yielder|
    head, *tail = enums
    head.zip(*tail) do |values|
      yielder.yield values
    end
  end
end

p zip_lazy(1..3, 1..4).all?{|l,r| l == r}
#=> true
p zip_lazy(1..3, 1..3).all?{|l,r| l == r}
#=> true
p zip_lazy(1..4, 1..3).all?{|l,r| l == r}
#=> false
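
One possible way to close that edge case, offered only as a sketch: append a unique end marker to each stream before zipping. If the first stream is shorter, its marker gets paired with a real element of the other stream and the comparison fails. terminated and the done marker are hypothetical names for this illustration; the sketch relies on Enumerable#zip with a block, which yields each tuple without building the result array.

```ruby
# Hypothetical helper: wraps a stream so that it yields all of its elements
# followed by one extra `marker` object.
def terminated(enum, marker)
  Enumerator.new do |yielder|
    enum.each { |x| yielder << x }
    yielder << marker
  end
end

def equal_streams?(s1, s2)
  done = Object.new # unique per call, so it cannot be a stream element
  terminated(s1, done).zip(terminated(s2, done)) do |a, b|
    # Equal elements, or both streams hitting the marker at the same position.
    return false unless a.equal?(b) || a == b
  end
  true
end

p equal_streams?(1..3, 1..4) #=> false
p equal_streams?(1..3, 1..3) #=> true
p equal_streams?(1..4, 1..3) #=> false
```

When the second stream is the shorter one, zip pads it with nil after its marker, and that nil is compared against the first stream's marker or a real element, so the mismatch is still detected.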

扭转时空 2024-11-24 09:23:39

Here is a 2-source example using a fiber/co-routine. It's a bit long-winded, but it's very explicit about its behavior, which is nice.

def zip_verbose(enum1, enum2)
  e2_fiber = Fiber.new do
    enum2.each{|e2| Fiber.yield true, e2 }
    Fiber.yield false, nil
  end
  e2_has_value, e2_val = true, nil
  enum1.each do |e1_val|
    e2_has_value, e2_val = e2_fiber.resume if e2_has_value
    yield [true, e1_val], [e2_has_value, e2_val]
  end
  return unless e2_has_value
  loop do
    e2_has_value, e2_val = e2_fiber.resume
    break unless e2_has_value
    yield [false, nil], [e2_has_value, e2_val]
  end
end

def zip(enum1, enum2)
  zip_verbose(enum1, enum2) {|e1, e2| yield e1[1], e2[1] }
end

def self.equal?(enum1, enum2)
  zip_verbose(enum1, enum2) do |e1,e2|
    return false unless e1 == e2
  end
  return true
end