Ruby's Range#step method causes very slow execution?

Posted on 2024-10-21 12:37:26

I've got this block of code:

date_counter = Time.mktime(2011,01,01,00,00,00,"+05:00")
@weeks = Array.new
(date_counter..Time.now).step(1.week) do |week|
   logger.debug "WEEK: " + week.inspect
   @weeks << week
end

Technically, the code works, outputting:

Sat Jan 01 00:00:00 -0500 2011
Sat Jan 08 00:00:00 -0500 2011
Sat Jan 15 00:00:00 -0500 2011
etc.

But the execution time is complete rubbish! It takes approximately four seconds to compute each week.

Is there some grotesque inefficiency in this code that I'm missing? It seems straightforward enough.

I'm running Ruby 1.8.7 with Rails 3.0.3.

错爱 2024-10-28 12:37:27

Assuming MRI and Rubinius use similar methods to generate the range, the basic algorithm, with all the extraneous checks and a few Fixnum optimisations removed, is:

class Range
  def each(&block)
    current = @first
    while current < @last
      yield current
      current = current.succ
    end
  end

  def step(step_size, &block)
    counter = 0
    each do |o|
      yield o if counter % step_size == 0
      counter += 1
    end
  end
end

(See the Rubinius source code)

For a Time object, #succ returns the time one second later. So even though you are asking for only one value per week, the range still has to step through every second between the two times.
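To put rough numbers on that, here is a small sketch that compares how many elements a succ-based walk would visit with how many it would actually yield. The two dates are assumptions chosen to be exactly ten weeks apart; they are not taken from the question.

```ruby
# Assumed example dates, exactly ten weeks (70 days) apart.
start  = Time.utc(2011, 1, 1)
finish = Time.utc(2011, 3, 12)

seconds_per_week = 7 * 24 * 60 * 60

# Subtracting two Time objects gives the gap in seconds as a Float.
span = (finish - start).to_i

# An inclusive one-second walk touches every second in the span.
seconds_visited = span + 1

# Only every seconds_per_week-th element survives the step filter.
weeks_yielded = span / seconds_per_week + 1

puts "elements visited: #{seconds_visited}"
puts "elements yielded: #{weeks_yielded}"
```

Millions of elements are visited for a handful of results, which is where the time goes.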

Edit: Solution

Build a range of Fixnums instead, since they have an optimised Range#step implementation.
Something like:

date_counter = Time.mktime(2011,01,01,00,00,00,"+05:00")
@weeks = Array.new

(date_counter.to_i..Time.now.to_i).step(1.week).map do |time|
  Time.at(time)
end.each do |week|
  logger.debug "WEEK: " + week.inspect
  @weeks << week
end
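The same idea can be sketched without Rails. The helper name `weekly_times` is hypothetical, and `7 * 24 * 60 * 60` is a plain-Ruby stand-in for ActiveSupport's `1.week`; the dates are assumed for illustration.

```ruby
# Hypothetical helper: returns one Time per week between start and finish.
def weekly_times(start, finish)
  seconds_per_week = 7 * 24 * 60 * 60 # plain-Ruby stand-in for 1.week
  # An integer range steps arithmetically, so only the weekly values are visited.
  (start.to_i..finish.to_i).step(seconds_per_week).map { |t| Time.at(t) }
end

# Assumed example dates, three weeks apart.
weeks = weekly_times(Time.utc(2011, 1, 1), Time.utc(2011, 1, 22))
weeks.each { |week| puts week.getutc }
```

Note that `Time.at` returns times in the local zone, so convert them if a specific offset matters.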

夜唯美灬不弃 2024-10-28 12:37:27

Yes, you are missing a gross inefficiency. Try this in irb to see what you're doing:

(Time.mktime(2011,01,01,00,00,00,"+05:00") .. Time.now).each { |x| puts x }

The range runs from January 1 to now in increments of one second, and that's a huge sequence. Unfortunately, Ruby isn't clever enough to combine the range generation and the one-week chunking into a single operation, so it has to step through all ~6 million entries.

BTW, "straightforward" and "grossly inefficient" are not mutually exclusive; in fact, they often go together.

UPDATE: If you do this:

(0 .. 6000000).step(7*24*3600) { |x| puts x }

Then the output is produced almost instantaneously. So the problem appears to be that Range doesn't know how to optimize the chunking when faced with a range of Time objects, but it handles Fixnum ranges quite nicely.
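If a Range isn't needed at all, another option is to do the chunking by hand and add a week to the Time on each pass. This is an alternative technique rather than anything proposed above; the method name `each_week_between` and the dates are hypothetical.

```ruby
# Hypothetical helper: collects one Time per week by plain addition,
# so no per-second iteration happens at all.
def each_week_between(start, finish)
  seconds_per_week = 7 * 24 * 60 * 60
  weeks = []
  current = start
  while current <= finish
    weeks << current
    current += seconds_per_week # Time + seconds returns a new Time
  end
  weeks
end

# Assumed example dates, two weeks apart.
result = each_week_between(Time.utc(2011, 1, 1), Time.utc(2011, 1, 15))
result.each { |week| puts week }
```

Because the arithmetic is done on Time objects directly, the results keep the zone of the starting Time.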
