如何优化 Scala 中的 for 理解和循环?
所以 Scala 应该和 Java 一样快。我正在重新审视 Scala 中的一些 Project Euler 问题,这些问题是我最初在 Java 中解决的。具体来说,问题 5:“能被 1 到 20 中所有数字整除的最小正数是多少?”
这是我的 Java 解决方案,在我的机器上需要 0.7 秒才能完成:
public class P005_evenly_divisible implements Runnable{
final int t = 20;
public void run() {
int i = 10;
while(!isEvenlyDivisible(i, t)){
i += 2;
}
System.out.println(i);
}
boolean isEvenlyDivisible(int a, int b){
for (int i = 2; i <= b; i++) {
if (a % i != 0)
return false;
}
return true;
}
public static void main(String[] args) {
new P005_evenly_divisible().run();
}
}
这是我“直接翻译”成 Scala,需要 103 秒(长了 147 倍!)
object P005_JavaStyle {
val t:Int = 20;
def run {
var i = 10
while(!isEvenlyDivisible(i,t))
i += 2
println(i)
}
def isEvenlyDivisible(a:Int, b:Int):Boolean = {
for (i <- 2 to b)
if (a % i != 0)
return false
return true
}
def main(args : Array[String]) {
run
}
}
最后这是我在函数式编程方面的尝试,需要 39 秒(长了 55 倍) )
object P005 extends App{
def isDivis(x:Int) = (1 to 20) forall {x % _ == 0}
def find(n:Int):Int = if (isDivis(n)) n else find (n+2)
println (find (2))
}
在 Windows 7 64 位上使用 Scala 2.9.0.1。如何提高性能?我做错了什么吗?或者 Java 只是速度快很多?
So Scala is supposed to be as fast as Java. I'm revisiting some Project Euler problems in Scala that I originally tackled in Java. Specifically Problem 5: "What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?"
Here's my Java solution, which takes 0.7 seconds to complete on my machine:
public class P005_evenly_divisible implements Runnable{
final int t = 20;
public void run() {
int i = 10;
while(!isEvenlyDivisible(i, t)){
i += 2;
}
System.out.println(i);
}
boolean isEvenlyDivisible(int a, int b){
for (int i = 2; i <= b; i++) {
if (a % i != 0)
return false;
}
return true;
}
public static void main(String[] args) {
new P005_evenly_divisible().run();
}
}
Here's my "direct translation" into Scala, which takes 103 seconds (147 times longer!)
object P005_JavaStyle {
val t:Int = 20;
def run {
var i = 10
while(!isEvenlyDivisible(i,t))
i += 2
println(i)
}
def isEvenlyDivisible(a:Int, b:Int):Boolean = {
for (i <- 2 to b)
if (a % i != 0)
return false
return true
}
def main(args : Array[String]) {
run
}
}
Finally here's my attempt at functional programming, which takes 39 seconds (55 times longer)
object P005 extends App{
def isDivis(x:Int) = (1 to 20) forall {x % _ == 0}
def find(n:Int):Int = if (isDivis(n)) n else find (n+2)
println (find (2))
}
Using Scala 2.9.0.1 on Windows 7 64-bit. How do I improve performance? Am I doing something wrong? Or is Java just a lot faster?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
这种特殊情况下的问题是您从 for 表达式中返回。这反过来会被转换为 NonLocalReturnException 的抛出,并在封闭方法中捕获。优化器可以消除 foreach,但还不能消除 throw/catch。而且投/接的成本很高。但由于此类嵌套返回在 Scala 程序中很少见,因此优化器尚未解决这种情况。目前正在进行改进优化器的工作,希望很快就能解决这个问题。
The problem in this particular case is that you return from within the for-expression. That in turn gets translated into a throw of a NonLocalReturnException, which is caught at the enclosing method. The optimizer can eliminate the foreach but cannot yet eliminate the throw/catch. And throw/catch is expensive. But since such nested returns are rare in Scala programs, the optimizer did not yet address this case. There is work going on to improve the optimizer which hopefully will solve this issue soon.
问题很可能是在方法
isEvenlyDivisible
中使用了for
理解。将for
替换为等效的while
循环应该可以消除与 Java 的性能差异。与 Java 的 for 循环相反,Scala 的 for 推导式实际上是高阶方法的语法糖;在本例中,您将调用
Range
对象的foreach
方法。 Scala 的 for 非常通用,但有时会导致性能不佳。您可能想尝试 Scala 2.9 版中的
-optimize
标志。观察到的性能可能取决于所使用的特定 JVM,以及 JIT 优化器是否有足够的“预热”时间来识别和优化热点。邮件列表上最近的讨论表明 Scala 团队正在致力于提高简单情况下的性能:
这是错误跟踪器中的问题:
https://issues.scala-lang.org/browse/SI-4633
更新 5 /28:
while
循环。The problem is most likely the use of a
for
comprehension in the methodisEvenlyDivisible
. Replacingfor
by an equivalentwhile
loop should eliminate the performance difference with Java.As opposed to Java's
for
loops, Scala'sfor
comprehensions are actually syntactic sugar for higher-order methods; in this case, you're calling theforeach
method on aRange
object. Scala'sfor
is very general, but sometimes leads to painful performance.You might want to try the
-optimize
flag in Scala version 2.9. Observed performance may depend on the particular JVM in use, and the JIT optimizer having sufficient "warm up" time to identify and optimize hot-spots.Recent discussions on the mailing list indicate that the Scala team is working on improving
for
performance in simple cases:Here is the issue in the bug tracker:
https://issues.scala-lang.org/browse/SI-4633
Update 5/28:
while
loops.作为后续,我尝试了 -optimize 标志,它将运行时间从 103 秒减少到 76 秒,但这仍然比 Java 或 while 循环慢 107 倍。
然后我正在研究“功能”版本:
并试图找出如何以简洁的方式摆脱“forall”。我惨遭失败,并想出了
我狡猾的 5 行解决方案已膨胀到 12 行。然而,这个版本的运行时间为0.71秒,与原始Java版本的速度相同,比上面使用“forall”的版本(40.2秒)快了56倍! (请参阅下面的编辑,了解为什么这比 Java 更快)
显然,我的下一步是将上面的内容翻译回 Java,但 Java 无法处理它并抛出一个 StackOverflowError,其中 n 约为 22000 标记。
然后我挠了挠头,用更多的尾递归替换了“while”,这节省了几行,运行速度也一样快,但让我们面对现实吧,读起来更混乱:
所以 Scala 的尾递归赢得了但令我惊讶的是,像“for”循环(和“forall”方法)这样简单的东西本质上已经被破坏了,必须被不优雅和冗长的“while”或尾递归所取代。我尝试 Scala 的很多原因是因为它的语法简洁,但如果我的代码运行速度慢 100 倍那就不好了!
编辑:(已删除)
编辑编辑:以前 2.5 秒和 0.7 秒的运行时间之间的差异完全是由于正在运行的 32 位或 64 位 JVM 造成的。用过的。命令行中的 Scala 使用 JAVA_HOME 设置的任何内容,而 Java 则使用 64 位(如果可用)。 IDE 有自己的设置。这里有一些测量:Eclipse 中的 Scala 执行时间
As a follow-up, I tried the -optimize flag and it reduced running time from 103 to 76 seconds, but that's still 107x slower than Java or a while loop.
Then I was looking at the "functional" version:
and trying to figure out how to get rid of the "forall" in a concise manner. I failed miserably and came up with
whereby my cunning 5-line solution has balooned to 12 lines. However, this version runs in 0.71 seconds, the same speed as the original Java version, and 56 times faster than the version above using "forall" (40.2 s)! (see EDIT below for why this is faster than Java)
Obviously my next step was to translate the above back into Java, but Java can't handle it and throws a StackOverflowError with n around the 22000 mark.
I then scratched my head for a bit and replaced the "while" with a bit more tail recursion, which saves a couple of lines, runs just as fast, but let's face it, is more confusing to read:
So Scala's tail recursion wins the day, but I'm surprised that something as simple as a "for" loop (and the "forall" method) is essentially broken and has to be replaced by inelegant and verbose "whiles", or tail recursion. A lot of the reason I'm trying Scala is because of the concise syntax, but it's no good if my code is going to run 100 times slower!
EDIT: (deleted)
EDIT OF EDIT: Former discrepancies between run times of 2.5s and 0.7s were entirely due to whether the 32-bit or 64-bit JVMs were being used. Scala from the command line uses whatever is set by JAVA_HOME, while Java uses 64-bit if available regardless. IDEs have their own settings. Some measurements here: Scala execution times in Eclipse
关于理解的答案是正确的,但这不是全部。您应该注意,在
isEvenlyDivisible
中使用return
并不是免费的。在for
中使用 return 会强制 scala 编译器生成非本地返回(即在其函数之外返回)。这是通过使用异常退出循环来完成的。如果您构建自己的控制抽象,也会发生同样的情况,例如:
这仅打印一次“Hi”。
请注意,
foo
中的return
退出foo
(这正是您所期望的)。由于括号中的表达式是一个函数文字,您可以在loop
的签名中看到,这会强制编译器生成非本地返回,即return
强制您退出foo
,而不仅仅是body
。在Java(即JVM)中,实现这种行为的唯一方法是抛出异常。
回到 isEvenlyDivisible:
if (a % i != 0) return false 是一个有返回值的函数文字,因此每次返回值被命中时,运行时必须抛出并捕获异常,这会导致相当多的 GC 开销。
The answer about for comprehension is right, but it's not the whole story. You should note note that the use of
return
inisEvenlyDivisible
is not free. The use of return inside thefor
, forces the scala compiler to generate a non-local return (i.e. to return outside it's function).This is done through the use of an exception to exit the loop. The same happens if you build your own control abstractions, for example:
This prints "Hi" only once.
Note that the
return
infoo
exitsfoo
(which is what you would expect). Since the bracketed expression is a function literal, which you can see in the signature ofloop
this forces the compiler to generate a non local return, that is, thereturn
forces you to exitfoo
, not just thebody
.In Java (i.e. the JVM) the only way to implement such behavior is to throw an exception.
Going back to
isEvenlyDivisible
:The
if (a % i != 0) return false
is a function literal that has a return, so each time the return is hit, the runtime has to throw and catch an exception, which causes quite a bit of GC overhead.我发现的一些加速
forall
方法的方法:原始:41.3 s
预实例化范围,这样我们就不会每次都创建一个新范围: >9.0 s
转换为列表而不是范围:4.8 s
我尝试了一些其他集合,但列表是最快的(尽管仍然比我们避免范围和高阶集合慢 7 倍)功能总共)。
虽然我是 Scala 的新手,但我猜想编译器可以通过简单地自动将方法中的 Range 文字(如上所述)替换为最外层作用域中的 Range 常量,轻松实现快速且显着的性能提升。或者更好的是,将它们像 Java 中的字符串文字一样进行实习。
脚注:
数组与 Range 大致相同,但有趣的是,使用新的
forall
方法(如下所示)可以使 64 位上的执行速度提高 24%,在 32 位上提高 8%。当我通过将因子数量从 20 减少到 15 来减少计算量时,差异消失了,所以这可能是垃圾收集效应。无论是什么原因,当长时间满负荷运行时,这一点都很重要。List 的类似皮条客也使性能提高了约 10%。
Some ways to speed up the
forall
method I discovered:The original: 41.3 s
Pre-instantiating the range, so we don't create a new range every time: 9.0 s
Converting to a List instead of a Range: 4.8 s
I tried a few other collections but List was fastest (although still 7x slower than if we avoid the Range and higher-order function altogether).
While I am new to Scala, I'd guess the compiler could easily implement a quick and significant performance gain by simply automatically replacing Range literals in methods (as above) with Range constants in the outermost scope. Or better, intern them like Strings literals in Java.
footnote:
Arrays were about the same as Range, but interestingly, pimping a new
forall
method (shown below) resulted in 24% faster execution on 64-bit, and 8% faster on 32-bit. When I reduced the calculation size by reducing the number of factors from 20 to 15 the difference disappeared, so maybe it's a garbage collection effect. Whatever the cause, it's significant when operating under full load for extended periods.A similar pimp for List also resulted in about 10% better performance.
我只是想对那些可能因此类问题而对 Scala 失去信心的人发表评论,因为几乎所有函数式语言的性能都会出现此类问题。如果你在 Haskell 中优化折叠,你通常必须将其重写为递归尾部调用优化循环,否则你将遇到性能和内存问题。
我知道不幸的是 FP 还没有优化到我们不必考虑这样的事情,但这根本不是 Scala 特有的问题。
I just wanted to comment for people who might lose faith in Scala over issues like this that these kinds of issues come up in the performance of just about all functional languages. If you are optimizing a fold in Haskell, you will often have to re-write it as a recursive tail-call-optimized loop, or else you'll have performance and memory issues to contend with.
I know it's unfortunate that FPs aren't yet optimized to the point where we don't have to think about things like this, but this is not at all a problem particular to Scala.
Scala 特有的问题已经讨论过,但主要问题是使用暴力算法并不是很酷。考虑一下(比原始 Java 代码快得多):
Problems specific to Scala have already been discussed, but the main problem is that using a brute-force algorithm is not very cool. Consider this (much faster than the original Java code):
尝试解决方案中给出的单行代码Scala for Project Euler
给出的时间至少比你的快,尽管远离 while 循环..:)
Try the one-liner given in the solution Scala for Project Euler
The time given is at least faster than yours, though far from the while loop.. :)