Why is lockless concurrency such a big deal (in Clojure)?
I'm told that Clojure has lockless concurrency and that this is Important.
I've used a number of languages but didn't realize they were performing locks behind the scenes.
Why is this an advantage in Clojure (or in any language that has this feature)?
Lockless concurrency also provides the nice advantage that readers never have to wait for other readers. This is especially useful when many threads will be reading data from a single source. You still need to define the data dependencies in your program and explicitly define the parts of a transaction that can be commuted safely.
STM saves you from deadlocks and almost all occurrences of livelock, though it does not save you from concurrency failures: you can still create cases where a transaction will fail because it lacks the resources to maintain its history. The important part is that concurrency failures will be explicit, and you can recover from them.
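A rough sketch of the "readers never wait" idea, in Java (the platform Clojure runs on) rather than Clojure itself: writers publish whole immutable values through an atomic reference, so readers just dereference and never block. The class and field names here are illustrative, not from any real library.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class SnapshotReads {
    // Writers swap in a whole new immutable list; readers just dereference it.
    static final AtomicReference<List<Integer>> data =
            new AtomicReference<>(List.of(1, 2, 3));

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                // Build a new immutable value and publish it atomically.
                data.updateAndGet(old ->
                        List.of(old.get(0) + 1, old.get(1), old.get(2)));
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                // get() never blocks and always sees a consistent snapshot.
                List<Integer> snap = data.get();
                if (snap.size() != 3) throw new AssertionError();
            }
        });
        writer.start(); reader.start();
        writer.join(); reader.join();
        System.out.println(data.get().get(0)); // 1001
    }
}
```

Note that the reader can run entirely concurrently with the writer and still never observes a half-updated list, because each published value is immutable.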
I can't speak about Clojure specifically, but ... it means you don't need to wait for someone to be done with something before you can get to work. Which is great.
Typically it's achieved with immutable types. If nothing can be modified, you don't really need to wait till someone else is done with it before you can access it.
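To make that concrete, here is a minimal Java sketch (class names are made up for illustration): an immutable value can be handed to any number of threads with no locking at all, because nobody can change it out from under anyone else.

```java
import java.util.List;

public class ImmutableSharing {
    // An immutable value: no thread can modify it, so no thread needs a lock.
    record Point(int x, int y) { }

    public static void main(String[] args) throws InterruptedException {
        List<Point> points = List.of(new Point(1, 2), new Point(3, 4));

        // Many threads can read concurrently with no coordination at all.
        Runnable reader = () -> {
            int sum = 0;
            for (Point p : points) sum += p.x() + p.y();
            if (sum != 10) throw new AssertionError();
        };
        Thread a = new Thread(reader), b = new Thread(reader);
        a.start(); b.start(); a.join(); b.join();
        System.out.println("both readers saw sum = 10");
    }
}
```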
Deadlocks. Or to be more correct the lack of them.
One of the biggest problems in most languages is that you eventually end up with deadlocks.
Now with no locks, obviously you won't run into deadlocks.
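The classic recipe for a deadlock is two threads taking the same two locks in opposite orders. A hedged Java sketch (names are illustrative); here `tryLock` with a timeout is used so the demonstration backs off instead of hanging forever, which is exactly the kind of defensive machinery lockless designs let you skip:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class DeadlockDemo {
    static final ReentrantLock lockA = new ReentrantLock();
    static final ReentrantLock lockB = new ReentrantLock();

    public static void main(String[] args) throws InterruptedException {
        // Thread 1 takes A then B; thread 2 takes B then A: the classic recipe.
        Thread t1 = new Thread(() -> acquireBoth(lockA, lockB, "t1"));
        Thread t2 = new Thread(() -> acquireBoth(lockB, lockA, "t2"));
        t1.start(); t2.start(); t1.join(); t2.join();
        System.out.println("done");
    }

    // tryLock with a timeout turns a silent deadlock into a visible failure.
    static void acquireBoth(ReentrantLock first, ReentrantLock second, String name) {
        first.lock();
        try {
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                try { /* critical section */ } finally { second.unlock(); }
            } else {
                System.out.println(name + " gave up: potential deadlock");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }
}
```

With plain `lock()` calls instead of `tryLock`, the same interleaving would simply hang, which is why deadlocks are so painful to reproduce and debug.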
The biggest deal is that locks don't compose.
While it's trivial to write code with a simple locking strategy (e.g. put it in a synchronized Java class...), it gets exponentially more complicated as you start to lock multiple objects and create complex transactions that combine different locked operations. Deadlocks can occur, performance suffers, locking logic makes the code extremely convoluted, and at some point the code becomes unmaintainable.
These problems will become apparent to anyone who has to build a large, complex concurrent system (and solving them was a major motivation for Rich Hickey in creating Clojure).
The second issue is performance.
Both locking and STM clearly impose overhead. But in some important cases the STM overhead can be much lower.
In particular, lockless concurrency (as with Clojure's STM) usually means that readers are not impaired by any other threads (including writers!) if they access data outside a transaction. This can be a huge win in the fairly common case where reads don't need to be transactional and dramatically outnumber writes (think of most web applications...). Non-transactional reads of an STM reference in Clojure are essentially overhead-free.
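The performance point can be illustrated with the JVM's own lock-free primitives, which are the kind of machinery Clojure's reference types build on. In this sketch (again Java, with made-up names), writers contend via a compare-and-swap retry loop while a concurrent reader polls freely; `get()` never blocks, even mid-write:

```java
import java.util.concurrent.atomic.AtomicLong;

public class LockFreeCounter {
    static final AtomicLong hits = new AtomicLong();

    public static void main(String[] args) throws InterruptedException {
        Runnable writer = () -> {
            for (int i = 0; i < 100_000; i++) {
                hits.incrementAndGet(); // CAS retry loop, no lock ever held
            }
        };
        Thread w1 = new Thread(writer), w2 = new Thread(writer);

        // A reader polls concurrently; get() never blocks, even mid-write.
        Thread reader = new Thread(() -> {
            long last = 0;
            while (last < 200_000) last = hits.get();
        });

        w1.start(); w2.start(); reader.start();
        w1.join(); w2.join(); reader.join();
        System.out.println(hits.get()); // 200000
    }
}
```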
As long as you write strictly sequential programs (do A, then B, then C; finished!) you don't have concurrency problems, and a language's concurrency mechanisms remain irrelevant.
When you graduate from "programming exercise" programs to real world stuff, pretty soon you encounter problems whose solution is multi-threading (or whatever flavor of concurrency you have available).
Case: Programs with a GUI. Say you're writing an editor with spell checking. You want the spell checker to be quietly doing its thing in the background, yet you want the GUI to smoothly accept user input. So you run those two activities as separate threads.
Case: I recently wrote a program (for work) that gathers statistics from two log files and writes them to a database. Each file takes about 3 minutes to process. I moved those processes into two threads that run side by side, cutting total processing time from 6 minutes to a little over 3.
Case: Scientific/engineering simulation software. There are lots and lots of problems that are solved by calculating some effect (heat flow, say) at every point in a 3-dimensional grid representing your test subject (star core, nuclear explosion, geographic dispersion of an insect population...). Basically the same computation is done at every point, and at lots of points, so it makes sense to have them done in parallel.
In all those cases and many more, whenever two computing processes access the same memory (= variables, if you like) at roughly the same time there is potential for them interfering with each other and messing up each others' work. The huge branch of Computer Science that deals with "concurrent programming" deals with ideas on how to solve this kind of problem.
A reasonably useful starting discussion of this topic can be found in Wikipedia.
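The interference described above can be reproduced in a few lines. This Java sketch (names are illustrative) has two threads increment a shared, unsynchronized counter; because `count++` is a read-modify-write, two threads can read the same value and one increment gets lost:

```java
public class RacyCounter {
    static int count = 0; // shared and unsynchronized: updates can interleave

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                count++; // read-modify-write: two threads may read the same value
            }
        };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start(); a.join(); b.join();
        // Often prints less than 200000 because some increments were lost.
        System.out.println("count = " + count + " (expected 200000)");
    }
}
```

Everything from locks to STM to immutable data exists to rule out exactly this kind of silent corruption.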
The benefit of lockless concurrency is the lack of complexity in the program. In imperative languages, concurrent programming relies on locks, and once the program gets even moderately complex, difficult-to-fix deadlock bugs creep in.
Such "lockless concurrency" isn't really a feature of a language; rather, it's a feature of a platform or runtime environment, and woe betide the language that won't get out of the way and give you access to those facilities.
Thinking about the trades between lock-based and lock-free concurrency is analogous to the metacircular evaluator problem: one can implement locks in terms of atomic operations (e.g. compare-and-swap, or CAS), and one can implement atomic operations in terms of locks. Which should be at the bottom?
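One half of that metacircularity is easy to show: a lock built out of nothing but an atomic compare-and-swap. This Java sketch (a toy spinlock, names invented for illustration, not production code) puts the atomic operation at the bottom and a lock on top:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        // Spin until we win the CAS: atomic ops at the bottom, a lock on top.
        while (!held.compareAndSet(false, true)) {
            Thread.onSpinWait();
        }
    }

    void unlock() {
        held.set(false); // volatile write releases the lock
    }

    static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        SpinLock lock = new SpinLock();
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                lock.lock();
                try { counter++; } finally { lock.unlock(); }
            }
        };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start(); a.join(); b.join();
        System.out.println(counter); // 200000 — the CAS-built lock works
    }
}
```

Going the other direction (building atomics out of locks) is just as possible, which is what makes "which should be at the bottom?" a genuine design question rather than a settled one.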