如何优化 Scala 中的 for 理解和循环?

发布于 2024-11-10 07:08:34 字数 1522 浏览 3 评论 0原文

所以 Scala 应该和 Java 一样快。我正在重新审视 Scala 中的一些 Project Euler 问题,这些问题是我最初在 Java 中解决的。具体来说,问题 5:“能被 1 到 20 中所有数字整除的最小正数是多少?”

这是我的 Java 解决方案,在我的机器上需要 0.7 秒才能完成:

public class P005_evenly_divisible implements Runnable{
    final int t = 20;

    public void run() {
        int i = 10;
        while(!isEvenlyDivisible(i, t)){
            i += 2;
        }
        System.out.println(i);
    }

    boolean isEvenlyDivisible(int a, int b){
        for (int i = 2; i <= b; i++) {
            if (a % i != 0) 
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        new P005_evenly_divisible().run();
    }
}

这是我“直接翻译”成 Scala,需要 103 秒(长了 147 倍!)

object P005_JavaStyle {
    val t:Int = 20;
    def run {
        var i = 10
        while(!isEvenlyDivisible(i,t))
            i += 2
        println(i)
    }
    def isEvenlyDivisible(a:Int, b:Int):Boolean = {
        for (i <- 2 to b)
            if (a % i != 0)
                return false
        return true
    }
    def main(args : Array[String]) {
        run
    }
}

最后这是我在函数式编程方面的尝试,需要 39 秒(长了 55 倍) )

object P005 extends App{
    def isDivis(x:Int) = (1 to 20) forall {x % _ == 0}
    def find(n:Int):Int = if (isDivis(n)) n else find (n+2)
    println (find (2))
}

在 Windows 7 64 位上使用 Scala 2.9.0.1。如何提高性能?我做错了什么吗?或者 Java 只是速度快很多?

So Scala is supposed to be as fast as Java. I'm revisiting some Project Euler problems in Scala that I originally tackled in Java. Specifically Problem 5: "What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?"

Here's my Java solution, which takes 0.7 seconds to complete on my machine:

public class P005_evenly_divisible implements Runnable{
    final int t = 20;

    public void run() {
        int i = 10;
        while(!isEvenlyDivisible(i, t)){
            i += 2;
        }
        System.out.println(i);
    }

    boolean isEvenlyDivisible(int a, int b){
        for (int i = 2; i <= b; i++) {
            if (a % i != 0) 
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        new P005_evenly_divisible().run();
    }
}

Here's my "direct translation" into Scala, which takes 103 seconds (147 times longer!)

object P005_JavaStyle {
    val t:Int = 20;
    def run {
        var i = 10
        while(!isEvenlyDivisible(i,t))
            i += 2
        println(i)
    }
    def isEvenlyDivisible(a:Int, b:Int):Boolean = {
        for (i <- 2 to b)
            if (a % i != 0)
                return false
        return true
    }
    def main(args : Array[String]) {
        run
    }
}

Finally here's my attempt at functional programming, which takes 39 seconds (55 times longer)

object P005 extends App{
    def isDivis(x:Int) = (1 to 20) forall {x % _ == 0}
    def find(n:Int):Int = if (isDivis(n)) n else find (n+2)
    println (find (2))
}

Using Scala 2.9.0.1 on Windows 7 64-bit. How do I improve performance? Am I doing something wrong? Or is Java just a lot faster?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(8

2024-11-17 07:08:34

这种特殊情况下的问题是您从 for 表达式中返回。这反过来会被转换为 NonLocalReturnException 的抛出,并在封闭方法中捕获。优化器可以消除 foreach,但还不能消除 throw/catch。而且投/接的成本很高。但由于此类嵌套返回在 Scala 程序中很少见,因此优化器尚未解决这种情况。目前正在进行改进优化器的工作,希望很快就能解决这个问题。

The problem in this particular case is that you return from within the for-expression. That in turn gets translated into a throw of a NonLocalReturnException, which is caught at the enclosing method. The optimizer can eliminate the foreach but cannot yet eliminate the throw/catch. And throw/catch is expensive. But since such nested returns are rare in Scala programs, the optimizer did not yet address this case. There is work going on to improve the optimizer which hopefully will solve this issue soon.

素罗衫 2024-11-17 07:08:34

问题很可能是在方法 isEvenlyDivisible 中使用了 for 理解。将 for 替换为等效的 while 循环应该可以消除与 Java 的性能差异。

与 Java 的 for 循环相反,Scala 的 for 推导式实际上是高阶方法的语法糖;在本例中,您将调用 Range 对象的 foreach 方法。 Scala 的 for 非常通用,但有时会导致性能不佳。

您可能想尝试 Scala 2.9 版中的 -optimize 标志。观察到的性能可能取决于所使用的特定 JVM,以及 JIT 优化器是否有足够的“预热”时间来识别和优化热点。

邮件列表上最近的讨论表明 Scala 团队正在致力于提高简单情况下的性能:

这是错误跟踪器中的问题:
https://issues.scala-lang.org/browse/SI-4633

更新 5 /28

  • 作为短期解决方案,ScalaCL 插件 (alpha) 会将简单的 Scala 循环转换为相当于 while 循环。
  • 作为一个潜在的长期解决方案,来自 EPFL 和斯坦福大学的团队正在合作开展一个项目,实现< a href="https://github.com/TiarkRompf/scala-virtualized">“虚拟”Scala 以获得非常高的性能。例如,多个惯用的函数循环可以在运行时融合到最佳的 JVM 字节码中,或者到另一个目标,例如 GPU。该系统是可扩展的,允许用户定义 DSL 和转换。查看出版物和斯坦福课程笔记。初步代码可在 Github 上获取,预计在未来几个月内发布。

The problem is most likely the use of a for comprehension in the method isEvenlyDivisible. Replacing for by an equivalent while loop should eliminate the performance difference with Java.

As opposed to Java's for loops, Scala's for comprehensions are actually syntactic sugar for higher-order methods; in this case, you're calling the foreach method on a Range object. Scala's for is very general, but sometimes leads to painful performance.

You might want to try the -optimize flag in Scala version 2.9. Observed performance may depend on the particular JVM in use, and the JIT optimizer having sufficient "warm up" time to identify and optimize hot-spots.

Recent discussions on the mailing list indicate that the Scala team is working on improving for performance in simple cases:

Here is the issue in the bug tracker:
https://issues.scala-lang.org/browse/SI-4633

Update 5/28:

  • As a short term solution, the ScalaCL plugin (alpha) will transform simple Scala loops into the equivalent of while loops.
  • As a potential longer term solution, teams from the EPFL and Stanford are collaborating on a project enabling run-time compilation of "virtual" Scala for very high performance. For example, multiple idiomatic functional loops can be fused at run-time into optimal JVM bytecode, or to another target such as a GPU. The system is extensible, allowing user defined DSLs and transformations. Check out the publications and Stanford course notes. Preliminary code is available on Github, with a release intended in the coming months.
北凤男飞 2024-11-17 07:08:34

作为后续,我尝试了 -optimize 标志,它将运行时间从 103 秒减少到 76 秒,但这仍然比 Java 或 while 循环慢 107 倍。

然后我正在研究“功能”版本:

object P005 extends App{
  def isDivis(x:Int) = (1 to 20) forall {x % _ == 0}
  def find(n:Int):Int = if (isDivis(n)) n else find (n+2)
  println (find (2))
}

并试图找出如何以简洁的方式摆脱“forall”。我惨遭失败,并想出了

object P005_V2 extends App {
  def isDivis(x:Int):Boolean = {
    var i = 1
    while(i <= 20) {
      if (x % i != 0) return false
      i += 1
    }
    return true
  }
  def find(n:Int):Int = if (isDivis(n)) n else find (n+2)
  println (find (2))
}

我狡猾的 5 行解决方案已膨胀到 12 行。然而,这个版本的运行时间为0.71秒,与原始Java版本的速度相同,比上面使用“forall”的版本(40.2秒)快了56倍! (请参阅下面的编辑,了解为什么这比 Java 更快)

显然,我的下一步是将上面的内容翻译回 Java,但 Java 无法处理它并抛出一个 StackOverflowError,其中 n 约为 22000 标记。

然后我挠了挠头,用更多的尾递归替换了“while”,这节省了几行,运行速度也一样快,但让我们面对现实吧,读起来更混乱:

object P005_V3 extends App {
  def isDivis(x:Int, i:Int):Boolean = 
    if(i > 20) true
    else if(x % i != 0) false
    else isDivis(x, i+1)

  def find(n:Int):Int = if (isDivis(n, 2)) n else find (n+2)
  println (find (2))
}

所以 Scala 的尾递归赢得了但令我惊讶的是,像“for”循环(和“forall”方法)这样简单的东西本质上已经被破坏了,必须被不优雅和冗长的“while”或尾递归所取代。我尝试 Scala 的很多原因是因为它的语法简洁,但如果我的代码运行速度慢 100 倍那就不好了!

编辑:(已删除)

编辑编辑:以前 2.5 秒和 0.7 秒的运行时间之间的差异完全是由于正在运行的 32 位或 64 位 JVM 造成的。用过的。命令行中的 Scala 使用 JAVA_HOME 设置的任何内容,而 Java 则使用 64 位(如果可用)。 IDE 有自己的设置。这里有一些测量:Eclipse 中的 Scala 执行时间

As a follow-up, I tried the -optimize flag and it reduced running time from 103 to 76 seconds, but that's still 107x slower than Java or a while loop.

Then I was looking at the "functional" version:

object P005 extends App{
  def isDivis(x:Int) = (1 to 20) forall {x % _ == 0}
  def find(n:Int):Int = if (isDivis(n)) n else find (n+2)
  println (find (2))
}

and trying to figure out how to get rid of the "forall" in a concise manner. I failed miserably and came up with

object P005_V2 extends App {
  def isDivis(x:Int):Boolean = {
    var i = 1
    while(i <= 20) {
      if (x % i != 0) return false
      i += 1
    }
    return true
  }
  def find(n:Int):Int = if (isDivis(n)) n else find (n+2)
  println (find (2))
}

whereby my cunning 5-line solution has balooned to 12 lines. However, this version runs in 0.71 seconds, the same speed as the original Java version, and 56 times faster than the version above using "forall" (40.2 s)! (see EDIT below for why this is faster than Java)

Obviously my next step was to translate the above back into Java, but Java can't handle it and throws a StackOverflowError with n around the 22000 mark.

I then scratched my head for a bit and replaced the "while" with a bit more tail recursion, which saves a couple of lines, runs just as fast, but let's face it, is more confusing to read:

object P005_V3 extends App {
  def isDivis(x:Int, i:Int):Boolean = 
    if(i > 20) true
    else if(x % i != 0) false
    else isDivis(x, i+1)

  def find(n:Int):Int = if (isDivis(n, 2)) n else find (n+2)
  println (find (2))
}

So Scala's tail recursion wins the day, but I'm surprised that something as simple as a "for" loop (and the "forall" method) is essentially broken and has to be replaced by inelegant and verbose "whiles", or tail recursion. A lot of the reason I'm trying Scala is because of the concise syntax, but it's no good if my code is going to run 100 times slower!

EDIT: (deleted)

EDIT OF EDIT: Former discrepancies between run times of 2.5s and 0.7s were entirely due to whether the 32-bit or 64-bit JVMs were being used. Scala from the command line uses whatever is set by JAVA_HOME, while Java uses 64-bit if available regardless. IDEs have their own settings. Some measurements here: Scala execution times in Eclipse

怕倦 2024-11-17 07:08:34

关于理解的答案是正确的,但这不是全部。您应该注意,在 isEvenlyDivisible 中使用 return 并不是免费的。在 for 中使用 return 会强制 scala 编译器生成非本地返回(即在其函数之外返回)。

这是通过使用异常退出循环来完成的。如果您构建自己的控制抽象,也会发生同样的情况,例如:

def loop[T](times: Int, default: T)(body: ()=>T) : T = {
    var count = 0
    var result: T = default
    while(count < times) {
        result = body()
        count += 1
    }
    result
}

def foo() : Int= {
    loop(5, 0) {
        println("Hi")
        return 5
    }
}

foo()

这仅打印一次“Hi”。

请注意,foo 中的return 退出foo(这正是您所期望的)。由于括号中的表达式是一个函数文字,您可以在 loop 的签名中看到,这会强制编译器生成非本地返回,即 return 强制您退出foo,而不仅仅是body

在Java(即JVM)中,实现这种行为的唯一方法是抛出异常。

回到 isEvenlyDivisible:

def isEvenlyDivisible(a:Int, b:Int):Boolean = {
  for (i <- 2 to b) 
    if (a % i != 0) return false
  return true
}

if (a % i != 0) return false 是一个有返回值的函数文字,因此每次返回值被命中时,运行时必须抛出并捕获异常,这会导致相当多的 GC 开销。

The answer about for comprehension is right, but it's not the whole story. You should note note that the use of return in isEvenlyDivisible is not free. The use of return inside the for, forces the scala compiler to generate a non-local return (i.e. to return outside it's function).

This is done through the use of an exception to exit the loop. The same happens if you build your own control abstractions, for example:

def loop[T](times: Int, default: T)(body: ()=>T) : T = {
    var count = 0
    var result: T = default
    while(count < times) {
        result = body()
        count += 1
    }
    result
}

def foo() : Int= {
    loop(5, 0) {
        println("Hi")
        return 5
    }
}

foo()

This prints "Hi" only once.

Note that the return in foo exits foo (which is what you would expect). Since the bracketed expression is a function literal, which you can see in the signature of loop this forces the compiler to generate a non local return, that is, the return forces you to exit foo, not just the body.

In Java (i.e. the JVM) the only way to implement such behavior is to throw an exception.

Going back to isEvenlyDivisible:

def isEvenlyDivisible(a:Int, b:Int):Boolean = {
  for (i <- 2 to b) 
    if (a % i != 0) return false
  return true
}

The if (a % i != 0) return false is a function literal that has a return, so each time the return is hit, the runtime has to throw and catch an exception, which causes quite a bit of GC overhead.

顾冷 2024-11-17 07:08:34

我发现的一些加速 forall 方法的方法:

原始:41.3 s

def isDivis(x:Int) = (1 to 20) forall {x % _ == 0}

预实例化范围,这样我们就不会每次都创建一个新范围: >9.0 s

val r = (1 to 20)
def isDivis(x:Int) = r forall {x % _ == 0}

转换为列表而不是范围:4.8 s

val rl = (1 to 20).toList
def isDivis(x:Int) = rl forall {x % _ == 0}

我尝试了一些其他集合,但列表是最快的(尽管仍然比我们避免范围和高阶集合慢 7 倍)功能总共)。

虽然我是 Scala 的新手,但我猜想编译器可以通过简单地自动将方法中的 Range 文字(如上所述)替换为最外层作用域中的 Range 常量,轻松实现快速且显着的性能提升。或者更好的是,将它们像 Java 中的字符串文字一样进行实习。


脚注
数组与 Range 大致相同,但有趣的是,使用新的 forall 方法(如下所示)可以使 64 位上的执行速度提高 24%,在 32 位上提高 8%。当我通过将因子数量从 20 减少到 15 来减少计算量时,差异消失了,所以这可能是垃圾收集效应。无论是什么原因,当长时间满负荷运行时,这一点都很重要。

List 的类似皮条客也使性能提高了约 10%。

  val ra = (1 to 20).toArray
  def isDivis(x:Int) = ra forall2 {x % _ == 0}

  case class PimpedSeq[A](s: IndexedSeq[A]) {
    def forall2 (p: A => Boolean): Boolean = {      
      var i = 0
      while (i < s.length) {
        if (!p(s(i))) return false
        i += 1
      }
      true
    }    
  }  
  implicit def arrayToPimpedSeq[A](in: Array[A]): PimpedSeq[A] = PimpedSeq(in)  

Some ways to speed up the forall method I discovered:

The original: 41.3 s

def isDivis(x:Int) = (1 to 20) forall {x % _ == 0}

Pre-instantiating the range, so we don't create a new range every time: 9.0 s

val r = (1 to 20)
def isDivis(x:Int) = r forall {x % _ == 0}

Converting to a List instead of a Range: 4.8 s

val rl = (1 to 20).toList
def isDivis(x:Int) = rl forall {x % _ == 0}

I tried a few other collections but List was fastest (although still 7x slower than if we avoid the Range and higher-order function altogether).

While I am new to Scala, I'd guess the compiler could easily implement a quick and significant performance gain by simply automatically replacing Range literals in methods (as above) with Range constants in the outermost scope. Or better, intern them like Strings literals in Java.


footnote:
Arrays were about the same as Range, but interestingly, pimping a new forall method (shown below) resulted in 24% faster execution on 64-bit, and 8% faster on 32-bit. When I reduced the calculation size by reducing the number of factors from 20 to 15 the difference disappeared, so maybe it's a garbage collection effect. Whatever the cause, it's significant when operating under full load for extended periods.

A similar pimp for List also resulted in about 10% better performance.

  val ra = (1 to 20).toArray
  def isDivis(x:Int) = ra forall2 {x % _ == 0}

  case class PimpedSeq[A](s: IndexedSeq[A]) {
    def forall2 (p: A => Boolean): Boolean = {      
      var i = 0
      while (i < s.length) {
        if (!p(s(i))) return false
        i += 1
      }
      true
    }    
  }  
  implicit def arrayToPimpedSeq[A](in: Array[A]): PimpedSeq[A] = PimpedSeq(in)  
百合的盛世恋 2024-11-17 07:08:34

我只是想对那些可能因此类问题而对 Scala 失去信心的人发表评论,因为几乎所有函数式语言的性能都会出现此类问题。如果你在 Haskell 中优化折叠,你通常必须将其重写为递归尾部调用优化循环,否则你将遇到性能和内存问题。

我知道不幸的是 FP 还没有优化到我们不必考虑这样的事情,但这根本不是 Scala 特有的问题。

I just wanted to comment for people who might lose faith in Scala over issues like this that these kinds of issues come up in the performance of just about all functional languages. If you are optimizing a fold in Haskell, you will often have to re-write it as a recursive tail-call-optimized loop, or else you'll have performance and memory issues to contend with.

I know it's unfortunate that FPs aren't yet optimized to the point where we don't have to think about things like this, but this is not at all a problem particular to Scala.

溺深海 2024-11-17 07:08:34

Scala 特有的问题已经讨论过,但主要问题是使用暴力算法并不是很酷。考虑一下(比原始 Java 代码快得多):

def gcd(a: Int, b: Int): Int = {
    if (a == 0)
        b
    else
        gcd(b % a, a)
}
print (1 to 20 reduce ((a, b) => {
  a / gcd(a, b) * b
}))

Problems specific to Scala have already been discussed, but the main problem is that using a brute-force algorithm is not very cool. Consider this (much faster than the original Java code):

def gcd(a: Int, b: Int): Int = {
    if (a == 0)
        b
    else
        gcd(b % a, a)
}
print (1 to 20 reduce ((a, b) => {
  a / gcd(a, b) * b
}))
浅忆流年 2024-11-17 07:08:34

尝试解决方案中给出的单行代码Scala for Project Euler

给出的时间至少比你的快,尽管远离 while 循环..:)

Try the one-liner given in the solution Scala for Project Euler

The time given is at least faster than yours, though far from the while loop.. :)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文