为什么Scala中foldLeft前面的过滤器很慢?

发布于 2024-10-13 17:47:40 字数 760 浏览 4 评论 0原文

我写了第一个 Project Euler 问题的答案:

将所有 1000 以下并且是 3 或 5 的倍数的自然数相加。

我想到的第一件事是:

(1 until 1000).filter(i => (i % 3 == 0 || i % 5 == 0)).foldLeft(0)(_ + _)

但是它很慢(需要 125 毫秒),所以我重写了它,只是想到“另一种方式”与“更快的方法'

(1 until 1000).foldLeft(0){
    (total, x) =>
        x match {
            case i if (i % 3 == 0 || i % 5 ==0) => i + total // Add
            case _ => total //skip
        }
}

这要快得多(仅 2 毫秒)。为什么?我猜第二个版本仅使用 Range 生成器,并且没有以任何方式体现完全实现的集合,一次完成所有操作,既更快又占用更少的内存。我说得对吗?

这里是 IdeOne 上的代码: http://ideone.com/GbKlP

I wrote an answer to the first Project Euler question:

Add all the natural numbers below one thousand that are multiples of 3 or 5.

The first thing that came to me was:

(1 until 1000).filter(i => (i % 3 == 0 || i % 5 == 0)).foldLeft(0)(_ + _)

but it's slow (it takes 125 ms), so I rewrote it, simply thinking of 'another way' versus 'the faster way'

(1 until 1000).foldLeft(0){
    (total, x) =>
        x match {
            case i if (i % 3 == 0 || i % 5 ==0) => i + total // Add
            case _ => total //skip
        }
}

This is much faster (only 2 ms). Why? I'm guess the second version uses only the Range generator and doesn't manifest a fully realized collection in any way, doing it all in one pass, both faster and with less memory. Am I right?

Here the code on IdeOne: http://ideone.com/GbKlP

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

帅哥哥的热头脑 2024-10-20 17:47:40

正如其他人所说,问题在于 filter 创建了一个新集合。替代的 withFilter 没有,但它没有 foldLeft。此外,使用 .view.iterator.toStream 都会避免以各种方式创建新集合,但它们在这里都比你使用的第一种方法,一开始我觉得有点奇怪。

但是,然后...看,1 到 1000 是一个 Range,它的大小实际上很小,因为它不存储每个元素。此外,Rangeforeach 进行了极其优化,甚至是专门化,这是其他任何集合都没有的情况。由于 foldLeft 是作为 foreach 实现的,只要您继续使用 Range,您就可以享受其优化的方法。

(_: Range).foreach:

@inline final override def foreach[@specialized(Unit) U](f: Int => U) {
    if (length > 0) {
        val last = this.last
        var i = start
        while (i != last) {
            f(i)
            i += step
        }
        f(i)
    }
}

(_: Range).view.foreach

def foreach[U](f: A => U): Unit = 
    iterator.foreach(f)

(_: Range).view.iterator

override def iterator: Iterator[A] = new Elements(0, length)

protected class Elements(start: Int, end: Int) extends BufferedIterator[A] with Serializable {
  private var i = start

  def hasNext: Boolean = i < end

  def next: A = 
    if (i < end) {
      val x = self(i)
      i += 1
      x
    } else Iterator.empty.next

  def head = 
    if (i < end) self(i) else Iterator.empty.next

  /** $super
   *  '''Note:''' `drop` is overridden to enable fast searching in the middle of indexed sequences.
   */
  override def drop(n: Int): Iterator[A] =
    if (n > 0) new Elements(i + n, end) else this

  /** $super
   *  '''Note:''' `take` is overridden to be symmetric to `drop`.
   */
  override def take(n: Int): Iterator[A] =
    if (n <= 0) Iterator.empty.buffered
    else if (i + n < end) new Elements(i, i + n) 
    else this
}

(_: Range).view.iterator.foreach

def foreach[U](f: A =>  U) { while (hasNext) f(next()) }

当然,这甚至没有计算 viewfoldLeft< 之间的 filter /代码>:

override def filter(p: A => Boolean): This = newFiltered(p).asInstanceOf[This]

protected def newFiltered(p: A => Boolean): Transformed[A] = new Filtered { val pred = p }

trait Filtered extends Transformed[A] {
  protected[this] val pred: A => Boolean 
  override def foreach[U](f: A => U) {
    for (x <- self)
      if (pred(x)) f(x)
  }
  override def stringPrefix = self.stringPrefix+"F"
}

The problem, as others have said, is that filter creates a new collection. The alternative withFilter doesn't, but that doesn't have a foldLeft. Also, using .view, .iterator or .toStream would all avoid creating the new collection in various ways, but they are all slower here than the first method you used, which I thought somewhat strange at first.

But, then... See, 1 until 1000 is a Range, whose size is actually very small, because it doesn't store each element. Also, Range's foreach is extremely optimized, and is even specialized, which is not the case of any of the other collections. Since foldLeft is implemented as a foreach, as long as you stay with a Range you get to enjoy its optimized methods.

(_: Range).foreach:

@inline final override def foreach[@specialized(Unit) U](f: Int => U) {
    if (length > 0) {
        val last = this.last
        var i = start
        while (i != last) {
            f(i)
            i += step
        }
        f(i)
    }
}

(_: Range).view.foreach

def foreach[U](f: A => U): Unit = 
    iterator.foreach(f)

(_: Range).view.iterator

override def iterator: Iterator[A] = new Elements(0, length)

protected class Elements(start: Int, end: Int) extends BufferedIterator[A] with Serializable {
  private var i = start

  def hasNext: Boolean = i < end

  def next: A = 
    if (i < end) {
      val x = self(i)
      i += 1
      x
    } else Iterator.empty.next

  def head = 
    if (i < end) self(i) else Iterator.empty.next

  /** $super
   *  '''Note:''' `drop` is overridden to enable fast searching in the middle of indexed sequences.
   */
  override def drop(n: Int): Iterator[A] =
    if (n > 0) new Elements(i + n, end) else this

  /** $super
   *  '''Note:''' `take` is overridden to be symmetric to `drop`.
   */
  override def take(n: Int): Iterator[A] =
    if (n <= 0) Iterator.empty.buffered
    else if (i + n < end) new Elements(i, i + n) 
    else this
}

(_: Range).view.iterator.foreach

def foreach[U](f: A =>  U) { while (hasNext) f(next()) }

And that, of course, doesn't even count the filter between view and foldLeft:

override def filter(p: A => Boolean): This = newFiltered(p).asInstanceOf[This]

protected def newFiltered(p: A => Boolean): Transformed[A] = new Filtered { val pred = p }

trait Filtered extends Transformed[A] {
  protected[this] val pred: A => Boolean 
  override def foreach[U](f: A => U) {
    for (x <- self)
      if (pred(x)) f(x)
  }
  override def stringPrefix = self.stringPrefix+"F"
}
无所的.畏惧 2024-10-20 17:47:40

首先尝试使集合变得惰性,所以

(1 until 1000).view.filter...

而不是

(1 until 1000).filter...

That 应该避免构建中间集合的成本。您还可能通过使用 sum 而不是 foldLeft(0)(_ + _) 获得更好的性能,某些集合类型总是可能有更有效的求和方法数字。如果没有,它仍然更干净、更具声明性......

Try making the collection lazy first, so

(1 until 1000).view.filter...

instead of

(1 until 1000).filter...

That should avoid the cost of building an intermediate collection. You might also get better performance from using sum instead of foldLeft(0)(_ + _), it's always possible that some collection type might have a more efficient way to sum numbers. If not, it's still cleaner and more declarative...

平安喜乐 2024-10-20 17:47:40

查看代码,看起来 filter 确实构建了一个新的 Seq,在该 Seq 上调用了 foldLeft 。第二个跳过了这一点。与其说内存是有帮助的,不如说是过滤后的集合根本就没有被构建。所有这些工作都还没有完成。

Range 使用 TranversableLike.filter,它看起来像这样:

def filter(p: A => Boolean): Repr = {
  val b = newBuilder
  for (x <- this) 
    if (p(x)) b += x
  b.result
}

我认为区别在于第 4 行的 +=foldLeft 中的过滤消除了它。

Looking through the code, it looks like filter does build a new Seq on which the foldLeft is called. The second skips that bit. It's not so much the memory, although that can't but help, but that the filtered collection is never built at all. All that work is never done.

Range uses TranversableLike.filter, which looks like this:

def filter(p: A => Boolean): Repr = {
  val b = newBuilder
  for (x <- this) 
    if (p(x)) b += x
  b.result
}

I think it's the += on line 4 that's the difference. Filtering in foldLeft eliminates it.

橙幽之幻 2024-10-20 17:47:40

filter 创建一个全新的序列,然后调用 foldLeft 。尝试:

(1 直到 1000).view.filter(i => (i % 3 == 0 || i % 5 == 0)).reduceLeft(_+_)

这将阻止所说的效果,只是包裹了原来的东西。将 foldLeftreduceLeft 交换只是装饰性的(在本例中)。

filter creates a whole new sequence on which then foldLeft is called. Try:

(1 until 1000).view.filter(i => (i % 3 == 0 || i % 5 == 0)).reduceLeft(_+_)

This will prevent said effect, merely wrapping the original thing. Exchanging foldLeft with reduceLeft is only cosmetic (in this case).

恋竹姑娘 2024-10-20 17:47:40

现在的挑战是,你能想出一种更有效的方法吗?在这种情况下,并不是说您的解决方案太慢,而是它的扩展能力如何?如果不是 1000,而是 1000000000 呢?有一个解决方案可以像计算前者一样快速地计算后者。

Now the challenge is, can you think of a yet more efficient way? Not that your solution is too slow in this case, but how well does it scale? What if instead of 1000, it was 1000000000? There is a solution that could compute the latter case just as quickly as the former.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文