Java vs. Scala from a concurrency perspective
I am kicking off my final-year project right now. I am going to be investigating concurrency approaches from the Java and Scala perspectives. Having come out of a Java concurrency module, I can see why people say that the shared-state threading approach is difficult to reason about. You have critical sections to worry about, and you run the risk of race conditions, deadlocks, etc. because of the nondeterministic way in which Java threads are scheduled. With Java 1.5 this reasoning was given some clarity, but it is still far from crystal clear.
At first glance, Scala appears to remove this complex reasoning through its actors. This lets the programmer develop concurrent systems from a more sequential viewpoint, which is easier to conceptualize. But am I right in saying that this benefit comes with some drawbacks? For instance, say we want to sort a large list in both scenarios. With Java you create two threads, split the list in two, worry about critical sections, atomic actions, etc., and go code. With Scala, because it is "share nothing", you actually have to pass half of the list to each of two actors to perform the sort operation, right?
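For concreteness, the Java half of this scenario might look like the minimal sketch below. The class name `TwoThreadSort` and the choice of an `int[]` are assumptions made for illustration; nothing here comes from an actual benchmark. The sketch assumes each thread owns one half of the array exclusively, in which case no lock is needed during the sort and `join()` is the only synchronization point.

```java
import java.util.Arrays;

// Hypothetical example class, introduced only for illustration.
final class TwoThreadSort {

    // Sorts `data` in place using two threads, each owning one half.
    static void sort(int[] data) {
        int mid = data.length / 2;

        // The two index ranges do not overlap, so the threads never
        // touch the same element and no lock is required while sorting.
        Thread left = new Thread(() -> Arrays.sort(data, 0, mid));
        Thread right = new Thread(() -> Arrays.sort(data, mid, data.length));
        left.start();
        right.start();

        try {
            // join() makes each thread's writes visible to the caller.
            left.join();
            right.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sorting", e);
        }

        // Merge the two sorted halves on the calling thread.
        int[] merged = new int[data.length];
        int i = 0, j = mid, k = 0;
        while (i < mid && j < data.length) {
            merged[k++] = (data[i] <= data[j]) ? data[i++] : data[j++];
        }
        while (i < mid) merged[k++] = data[i++];
        while (j < data.length) merged[k++] = data[j++];
        System.arraycopy(merged, 0, data, 0, data.length);
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 7, 2, 8};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

When the work splits this cleanly, the shared-state version needs very little locking; the hard cases are the ones where the threads really do contend for the same data.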
I guess my question is: in Scala, is the price you pay for simpler reasoning the performance overhead of having to pass the collection to your actors?
I was thinking of running some benchmarks to this effect (selection sort, quicksort, etc.), but because one is functional and one is imperative, I would not be comparing apples with apples from an algorithm viewpoint.
I would really appreciate any views you guys have on the above to give me some ideas to get me started.
Many thanks.
Comments (2)
The nice thing about Scala is that you can do concurrency the Java way if you want. All the Java classes are available.
So it really boils down to the difference between a model where you have threads with concurrent access to mutable variables, and a model where you have stateful actors that send messages to each other but do not peek into each other's internals. And you're absolutely right that in some scenarios you have to trade off performance against ease of getting the code correct.
As a rough rule of thumb, I generally find that the Java model is far superior when three things hold: a pile of threads would spend a significant amount of time waiting for a lock to open up, there is no clean way to separate the work so that not everyone waits on that resource, and execution switches between threads quickly. In that situation the actor alternative loses out, because the actor sends an "I'm done" message back to a supervisor, which then sends a "Here's new work!" message to an existing non-busy actor. Sorting algorithms, depending on how you envision them, can very much fall into this category.
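To make the supervisor round trip concrete, here is a minimal sketch in plain Java. It is not Scala's actor API: two `BlockingQueue`s stand in for mailboxes, and the class name `SupervisorSketch`, the `STOP` sentinel, and the "square a number" task are all assumptions made for illustration. The sketch also assumes the work items are non-negative, since `-1` is reserved as the stop message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: plain threads, with queues standing in for mailboxes.
final class SupervisorSketch {

    // Sentinel "stop" message; work items are assumed to be non-negative.
    private static final int STOP = -1;

    // Sends one item at a time and waits for the reply before sending the next.
    static List<Integer> squareAll(List<Integer> work) {
        BlockingQueue<Integer> workerMailbox = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> supervisorMailbox = new LinkedBlockingQueue<>();

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    int n = workerMailbox.take();   // receives "Here's new work!"
                    if (n == STOP) return;
                    supervisorMailbox.put(n * n);   // replies "I'm done"
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        List<Integer> results = new ArrayList<>();
        try {
            for (int n : work) {
                workerMailbox.put(n);                   // hand-off 1: supervisor to worker
                results.add(supervisorMailbox.take());  // hand-off 2: worker to supervisor
            }
            workerMailbox.put(STOP);
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while supervising", e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(squareAll(List.of(1, 2, 3)));
    }
}
```

Every unit of work costs two queue hand-offs here. When each unit is tiny, as in the inner steps of a sort, that round trip can plausibly outweigh the work itself, which is the situation the rule of thumb describes.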
For most everything else, the performance penalty associated with actors doesn't amount to much as far as I've seen. If you can conceive of your problem as lots and lots of reactive elements (i.e. they only need time when they've received a message), then actors can scale particularly well (millions available, though only a handful are working at any given instant); with threads, you'd need to have some sort of extra internal state to keep track of who should be doing what work, since you couldn't handle that many active threads.
I'm just going to point out here that Scala does not copy arguments passed to actors, so actors can share whatever is passed to them.
As opposed to Erlang, it is the programmer's responsibility to avoid sharing mutable stuff. However, there is no penalty in sharing immutable stuff, since there's no need to lock it, as all accesses to it are read-only. And Scala has strong support for immutable data structures.
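The same idea can be sketched in plain Java, as an analogue rather than Scala's actor API: two tasks receive the same reference to an immutable list and only read from it, so nothing is copied and nothing is locked. The class name `SharedImmutable` and the summing task are assumptions made for illustration; `List.of` requires Java 9 or later.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical example class, introduced only for illustration.
final class SharedImmutable {

    // Both tasks capture the same List reference; the list is never copied.
    static long sumInTwoHalves(List<Integer> shared) {
        int mid = shared.size() / 2;
        CompletableFuture<Long> left =
                CompletableFuture.supplyAsync(() -> sumRange(shared, 0, mid));
        CompletableFuture<Long> right =
                CompletableFuture.supplyAsync(() -> sumRange(shared, mid, shared.size()));
        // join() waits for each task and returns its result.
        return left.join() + right.join();
    }

    // Read-only access: safe without locks as long as the list is immutable.
    static long sumRange(List<Integer> xs, int from, int to) {
        long total = 0;
        for (int i = from; i < to; i++) {
            total += xs.get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4, 5, 6); // immutable list
        System.out.println(sumInTwoHalves(data));
    }
}
```

Passing `shared` to each task costs one reference, regardless of how large the list is, so "passing the collection" is not where the overhead comes from when the data is immutable.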