Which operations run in bulk when using parallel collections? Strange behavior here
Enter the following small sequential program and its parallelized version in the Scala REPL:
/* Activate time measurement in "App" class. Prints [total <X> ms] on exit. */
util.Properties.setProp("scala.time", "true")
/* Define sequential program version. */
object X extends App { for (x <- (1 to 10)) {Thread.sleep(1000);println(x)}}
/* Define parallel program version. Note '.par' selector on Range here. */
object Y extends App { for (y <- (1 to 10).par) {Thread.sleep(1000);println(y)}}
Executing X with X.main(Array.empty) gives:
1
2
3
4
5
6
7
8
9
10
[total 10002ms]
whereas executing Y with Y.main(Array.empty) gives:
1
6
2
7
3
8
4
9
10
5
[total 5002ms]
So far so good. But what about the following two variations of the program:
object X extends App {(1 to 10).foreach{Thread.sleep(1000);println(_)}}
object Y extends App {(1 to 10).par.foreach{Thread.sleep(1000);println(_)}}
They give me runtimes of [total 1002ms] and [total 1002ms] respectively. How can this be?
Comments (2)
This has nothing to do with parallel collections. The problem is hidden in the function literal. You can see it if you let the compiler show the AST (with option -Xprint:typer): in the tree produced for foreach{Thread.sleep(1000);println(_)}, the call to Thread.sleep sits outside the generated function, whereas in the tree produced for the for loop it sits inside the function body. The difference is small but it matters. If you want the expected result, you have to change the foreach expression to
(1 to 10).foreach{x => Thread.sleep(1000);println(x)}
But what is the difference? In your code you pass a block to foreach. The block is executed once, and its result is the function to execute. This returned function is then handed to foreach, not the block that contains it.
This mistake is made often. It has to do with the underscore placeholder. Maybe this question helps you.
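As a rough sketch of what the two forms amount to, here are hand-written equivalents (not actual compiler output). The names UnderscoreExpansion, blockForm, literalForm and log are made up for illustration, and a log entry stands in for Thread.sleep and println so the order of effects can be inspected:

```scala
import scala.collection.mutable.ArrayBuffer

object UnderscoreExpansion {
  // Records side effects in order; stands in for Thread.sleep and println.
  val log = ArrayBuffer.empty[String]

  // Hand-written equivalent of a block that ends in a placeholder function.
  // The underscore binds only to the innermost expression, so the first
  // statement of the block runs once, and the function the block ends in
  // is what foreach receives.
  def blockForm(): Unit = {
    val f: Int => Unit = {
      log += "sleep"
      (x: Int) => { log += s"print $x"; () }
    }
    (1 to 3).foreach(f)
  }

  // Explicit parameter: the whole body is the function, so both
  // statements run for every element.
  def literalForm(): Unit =
    (1 to 3).foreach { x =>
      log += "sleep"
      log += s"print $x"
    }
}
```

Calling blockForm() should leave a single "sleep" entry followed by the three print entries, while literalForm() should alternate "sleep" and print entries, which corresponds to the 1 second versus 10 seconds seen in the question.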
An interesting way of thinking about it: because Scala is call-by-value (Call by name vs call by value in Scala, clarification needed), when you hand {Thread.sleep(1000);println(_)} to foreach, the block {Thread.sleep(1000);println(_)} is evaluated only once and only the resulting println(_) function is handed to foreach. When you write foreach{x => Thread.sleep(1000); println(x)}, you are handing Thread.sleep(1000) as well as println(x) to foreach as part of the function. This is just another way of saying what sschaef already said.
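To make the call-by-value point concrete, here is a minimal sketch that counts how often the block itself is evaluated. Everything in it is hypothetical and only for illustration: eachByValue mimics the by-value parameter of foreach, eachByName is an imagined by-name variant that the standard library does not offer, and a counter stands in for Thread.sleep.

```scala
object CallByValueDemo {
  // Counts how many times the block argument is evaluated.
  var blockRuns = 0

  // Stand-in for println; ignores its argument.
  def sink(x: Int): Unit = ()

  // By-value parameter, like foreach: the argument block is evaluated
  // once, before the method body runs.
  def eachByValue(n: Int)(f: Int => Unit): Unit =
    (1 to n).foreach(x => f(x))

  // By-name parameter (hypothetical variant): the argument block is
  // re-evaluated every time f is referenced.
  def eachByName(n: Int)(f: => Int => Unit): Unit =
    (1 to n).foreach(x => f(x))

  def runByValue(n: Int): Int = {
    blockRuns = 0
    eachByValue(n) { blockRuns += 1; sink(_) }
    blockRuns
  }

  def runByName(n: Int): Int = {
    blockRuns = 0
    eachByName(n) { blockRuns += 1; sink(_) }
    blockRuns
  }
}
```

With the by-value version the counter ends at 1 regardless of n, which mirrors the single Thread.sleep in the question. With the by-name version it ends at n, because the block is evaluated again on each use.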