What is the best programming approach/methodology to assure thread safety?
When I was learning Java coming from a background of some 20 years of procedural programming with BASIC, Pascal, COBOL and C, I thought at the time that the hardest thing about it was wrapping my head around the OOP jargon and concepts. Now with about 8 years of solid Java under my belt, I have come to the conclusion that the single hardest thing about programming in Java and similar languages like C# is the multithreaded/concurrent aspects.
Coding reliable and scalable multi-threaded applications is just plain hard! And with the trend for processors to grow "wider" rather than faster, it is rapidly becoming just plain critical.
The hardest area is, of course, controlling interactions between threads and the resulting bugs: deadlocks, race conditions, stale data and latency.
So my question to you is this: what approach or methodology do you employ for producing safe concurrent code while mitigating the potential for deadlocks, latency, and other problems? I have come up with an approach which is a little unconventional but has worked very well in several large applications, which I will share in a detailed answer to this question.
This not only applies to Java but to threaded programming in general. I find myself avoiding most of the concurrency and latency problems just by following these guidelines:
1/ Let each thread run its own lifetime (i.e., decide when to die). It can be prompted from outside (say a flag variable) but it is entirely responsible.
2/ Have all threads allocate and free their resources in the same order - this guarantees that deadlock will not happen.
3/ Lock resources for the shortest time possible.
4/ Pass responsibility for data with the data itself - once you notify a thread that the data is its to process, leave it alone until the responsibility is given back to you.
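Guidelines 2 and 3 can be sketched in Java roughly as follows (the names LOCK_A, LOCK_B and transfer are illustrative, not from the post; the point is that every code path takes the locks in the same fixed order and releases them promptly):

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockOrdering {
    // A single global acquisition order: LOCK_A is always taken before LOCK_B.
    static final ReentrantLock LOCK_A = new ReentrantLock();
    static final ReentrantLock LOCK_B = new ReentrantLock();

    static int sharedA = 0, sharedB = 0;

    static void transfer(int amount) {
        LOCK_A.lock();           // first lock, always
        try {
            LOCK_B.lock();       // second lock, always
            try {
                sharedA -= amount;
                sharedB += amount;
            } finally {
                LOCK_B.unlock(); // release as soon as possible (guideline 3)
            }
        } finally {
            LOCK_A.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> transfer(10));
        Thread t2 = new Thread(() -> transfer(5));
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(sharedA + " " + sharedB); // -15 15
    }
}
```

Because no thread ever holds LOCK_B while waiting for LOCK_A, the circular-wait condition for deadlock can never arise.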
There are a number of techniques which are coming into the public consciousness just now (as in: the last few years). A big one would be actors. This is something that Erlang first brought to the grid iron but which has been carried forward by newer languages like Scala (actors on the JVM). While it is true that actors don't solve every problem, they do make it much easier to reason about your code and identify trouble spots. They also make it much simpler to design parallel algorithms because of the way they force you to use continuation passing over shared mutable state.
Fork/Join is something you should look at, especially if you're on the JVM. Doug Lea wrote the seminal paper on the topic, but many researchers have discussed it over the years. As I understand it, Doug Lea's reference framework is scheduled for inclusion into Java 7.
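For illustration, a minimal fork/join summation in the java.util.concurrent style that shipped with Java 7 (the task name SumTask and the threshold value are my own choices):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Recursively split a summation in half until the chunks are small enough
// to compute directly, then join the partial results.
public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] data;
    private final int lo, hi;

    SumTask(long[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {           // small enough: sum directly
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += data[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(data, lo, mid);
        SumTask right = new SumTask(data, mid, hi);
        left.fork();                          // run the left half asynchronously
        return right.compute() + left.join(); // compute the right half here, then join
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        long sum = new ForkJoinPool().invoke(new SumTask(data, 0, data.length));
        System.out.println(sum); // 49995000
    }
}
```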
On a slightly less-invasive level, often the only steps necessary to simplify a multi-threaded application are just to reduce the complexity of the locking. Fine-grained locking (in the Java 5 style) is great for throughput, but very very difficult to get right. One alternative approach to locking which is gaining some traction through Clojure would be software-transactional memory (STM). This is essentially the opposite of conventional locking in that it is optimistic rather than pessimistic. You start out by assuming that you won't have any collisions, and then allow the framework to fix the problems if and when they occur. Databases often work this way. It's great for throughput on systems with low collision rates, but the big win is in the logical componentization of your algorithms. Rather than arbitrarily associating a lock (or a series of locks) with some data, you just wrap the dangerous code in a transaction and let the framework figure out the rest. You can even get a fair bit of compile-time checking out of decent STM implementations like GHC's STM monad or my experimental Scala STM.
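Java itself has no built-in STM, but the optimistic "assume no collision, then retry on conflict" idea can be sketched with a compare-and-set loop; this is only an analogy to a real transaction system like GHC's STM monad, not an implementation of one:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

public class OptimisticCounter {
    private final AtomicReference<Integer> value = new AtomicReference<>(0);

    // A miniature "transaction": read a snapshot, compute a result, and
    // commit only if nobody else committed in between; otherwise retry.
    public int update(UnaryOperator<Integer> tx) {
        while (true) {
            Integer snapshot = value.get();
            Integer proposed = tx.apply(snapshot);
            if (value.compareAndSet(snapshot, proposed)) {
                return proposed;               // commit succeeded
            }                                  // conflict detected: retry
        }
    }

    public static void main(String[] args) throws InterruptedException {
        OptimisticCounter c = new OptimisticCounter();
        Runnable adder = () -> { for (int i = 0; i < 1000; i++) c.update(v -> v + 1); };
        Thread t1 = new Thread(adder), t2 = new Thread(adder);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.update(v -> v)); // 2000
    }
}
```

As with STM, the caller never names a lock; it just describes the dangerous computation and lets the retry machinery resolve collisions.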
There are a lot of new options for building concurrent applications, which one you pick depends greatly on your expertise, your language and what sort of problem you're trying to model. As a general rule, I think actors coupled with persistent, immutable data structures are a solid bet, but as I said, STM is a little less invasive and can sometimes yield more immediate improvements.
There is no One True Answer for thread safety in Java. However, there is at least one really great book: Java Concurrency in Practice. I refer to it regularly (especially the online Safari version when I'm traveling).
I strongly recommend that you peruse this book in depth. You may find that the costs and benefits of your unconventional approach are examined there.
I typically follow an Erlang style approach. I use the Active Object Pattern.
It works as follows.
Divide your application into very coarse-grained units. In one of my current applications (400,000 LOC) I have approximately 8 of these coarse-grained units. These units share no data at all. Every unit keeps its own local data. Every unit runs on its own thread (= Active Object Pattern) and hence is single-threaded. You don't need any locks within the units. When a unit needs to send messages to other units, it does so by posting a message to the other unit's queue. The other unit picks the message off the queue and reacts to that message. This might trigger other messages to other units.
Consequently the only locks in this type of application are around the queues (one queue and lock per unit). This architecture is deadlock free by definition!
This architecture scales extremely well and is very easy to implement and extend once you understand the basic principle. I like to think of it as an SOA within an application.
When dividing your app into units, remember: the optimum number of long-running threads per CPU core is 1.
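A minimal sketch of one such unit in Java (the class and message names are illustrative, not taken from the answer's actual application): each unit owns one thread and one queue, keeps its data private, and the queue is the only synchronized touch point.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class Unit implements Runnable {
    private final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
    private final StringBuilder localState = new StringBuilder(); // unit-private data
    private final Thread thread = new Thread(this);

    public void start()          { thread.start(); }
    public void post(String msg) { inbox.add(msg); }  // the only shared entry point

    public void shutdown() {
        inbox.add("POISON");                          // the unit decides when to die
        try { thread.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public String state() { return localState.toString(); } // safe after shutdown()

    @Override
    public void run() {
        try {
            while (true) {
                String msg = inbox.take();            // the queue is the only lock
                if (msg.equals("POISON")) return;
                localState.append(msg);               // single-threaded: no locks needed
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Unit unit = new Unit();
        unit.start();
        unit.post("hello ");
        unit.post("world");
        unit.shutdown();
        System.out.println(unit.state()); // hello world
    }
}
```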
I recommend flow-based programming, aka dataflow programming. It uses OOP and threads, and I feel it is a natural step forward, just as OOP was from procedural programming. That said, dataflow programming can't be used for everything; it is not generic.
Wikipedia has good articles on the topic:
http://en.wikipedia.org/wiki/Dataflow_programming
http://en.wikipedia.org/wiki/Flow-based_programming
Also, it has several advantages, such as incredibly flexible configuration and layering; the programmer (the component programmer) does not have to program the business logic, as that is done in another stage (putting the processing network together).
Did you know that make is a dataflow system? See make -j, especially if you have a multi-core processor.
Writing all the code in a multi-threaded application very... carefully! I don't know any better answer than that. (This involves stuff like jonnii mentioned).
I've heard people argue (and agree with them) that the traditional threading model really won't work going into the future, so we're going to have to develop a different set of paradigms / languages to really use these newfangled multi-cores effectively. Languages like Haskell, whose programs are easily parallelizable since any function that has side effects must be explicitly marked that way, and Erlang, which I unfortunately don't know that much about.
I suggest the actor model.
The actor model is what you are using, and it is by far the simplest (and an efficient) way to do multithreaded work. Basically each thread has a (synchronized) queue (it can be OS dependent or not), and other threads generate messages and put them in the queue of the thread that will handle the message.
Basic example:
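A minimal sketch of that per-thread queue arrangement (names are illustrative): a producer posts messages to a consumer thread's queue, and the consumer handles them one at a time.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumer {
    // Produce the integers 1..n, consume and sum them on a separate thread.
    static int run(int n) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
        int[] total = {0};

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    int msg = queue.take();   // blocks until a message arrives
                    if (msg < 0) return;      // negative value = stop sentinel
                    total[0] += msg;          // only the consumer touches total
                }
            } catch (InterruptedException ignored) { }
        });
        consumer.start();

        try {
            for (int i = 1; i <= n; i++) queue.put(i); // producer side
            queue.put(-1);                             // tell the consumer to finish
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // 15
    }
}
```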
It is a typical example of the producer-consumer problem.
It is clearly a difficult problem. Apart from the obvious need for carefulness, I believe that the very first step is to define precisely what threads you need and why.
Design threads as you would design classes: make sure you know what makes them consistent: their contents and their interactions with other threads.
I recall being somewhat shocked to discover that the list returned by Java's Collections.synchronizedList wasn't fully thread-safe, but only conditionally thread-safe. I could still get burned if I didn't wrap my accesses (iterators, setters, etc.) in a synchronized block. This means that I might've assured my team and my management that my code was thread-safe, but I might've been wrong. Another way I can assure thread safety is for a tool to analyse the code and have it pass. STM, the Actor model, Erlang, etc. are some ways of getting the latter form of assurance. Being able to assure properties of a program reliably is/will be a huge step forward in programming.
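To make the conditional part concrete: per the JDK documentation, individual calls on a synchronizedList are atomic, but iteration is a compound action and must still be wrapped in a synchronized block on the list itself (safeSum below is an illustrative helper, not a JDK method):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ConditionalSafety {
    static int safeSum(List<Integer> list) {
        int sum = 0;
        // Required by the Collections.synchronizedList contract: without this
        // block, a concurrent add() could throw ConcurrentModificationException
        // or yield stale data mid-iteration.
        synchronized (list) {
            for (int value : list) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Collections.addAll(list, 1, 2, 3);
        System.out.println(safeSum(list)); // 6
    }
}
```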
Looks like your IOC is somewhat FBP-like :-) It would be fantastic if the JavaFBP code could get a thorough vetting from someone like yourself versed in the art of writing thread-safe code... It's on SVN in SourceForge.
Some experts feel the answer to your question is to avoid threads altogether, because it's almost impossible to avoid unforeseen problems. To quote The Problem with Threads:
We developed a process that included a code maturity rating system (with four levels, red, yellow, green, and blue), design reviews, code reviews, nightly builds, regression tests, and automated code coverage metrics. The portion of the kernel that ensured a consistent view of the program structure was written in early 2000, design reviewed to yellow, and code reviewed to green. The reviewers included concurrency experts, not just inexperienced graduate students (Christopher Hylands (now Brooks), Bart Kienhuis, John Reekie, and [Ed Lee] were all reviewers). We wrote regression tests that achieved 100 percent code coverage...
The... system itself began to be widely used, and every use of the system exercised this code. No problems were observed until the code deadlocked on April 26, 2004, four years later.
The safest approach to design new applications with multi threading is to adhere to the rule:
No design below the design.
What does that mean?
Imagine you identified the major building blocks of your application. Let them be the GUI and some computation engines. Typically, once you have a large enough team, some people in the team will ask for "libraries" to "share code" between those major building blocks. While it is relatively easy at the start to define the threading and collaboration rules for the major building blocks, all that effort is now in danger, as the "code reuse libraries" will be badly designed, designed only when needed, and littered with locks and mutexes which "feel right".
Those ad-hoc libraries are the design below your design and the major risk for your threading architecture.
What to do about it?
Last but not least, consider having some message-based interaction between your major building blocks; see the often-mentioned actor model, for example.
The core concerns as I saw them were (a) avoiding deadlocks and (b) exchanging data between threads. A lesser concern (but only slightly lesser) was avoiding bottlenecks. I had already encountered several problems with disparate out-of-sequence locking causing deadlocks - it's all very well to say "always acquire locks in the same order", but in a medium to large system it is practically speaking often impossible to ensure this.
Caveat: When I came up with this solution I had to target Java 1.1 (so the concurrency package was not yet a twinkle in Doug Lea's eye) - the tools at hand were entirely synchronized and wait/notify. I drew on experience writing a complex multi-process communications system using the real-time message based system QNX.
Based on my experience with QNX, which had the deadlock concern but avoided data concurrency by copying messages from one process's memory space to another's, I came up with a message-based approach for objects - which I called IOC, for inter-object coordination. At the inception I envisaged I might create all my objects like this, but in hindsight it turns out that they are only necessary at the major control points in a large application - the "interstate interchanges", if you will, not appropriate for every single "intersection" in the road system. That turns out to be a major benefit because they are quite un-POJO.
I envisaged a system where objects would not conceptually invoke synchronized methods, but instead would "send messages". Messages could be send/reply, where the sender waits while the message is processed and returns with the reply, or asynchronous where the message is dropped on a queue and dequeued and processed at a later stage. Note that this is a conceptual distinction - the messaging was implemented using synchronized method calls.
The core objects for the messaging system are an IsolatedObject, an IocBinding and an IocTarget.
The IsolatedObject is so called because it has no public methods; it is this that is extended in order to receive and process messages. Using reflection it is further enforced that the child object has no public methods, nor any package or protected methods except those inherited from IsolatedObject, nearly all of which are final; it looks very strange at first, because when you subclass IsolatedObject you create an object with 1 protected method:
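A plausible sketch of that shape (only the class and method names IsolatedObject and processIocMessage come from the description above; the exact signature, the base-class internals, and the Counter example are my own assumptions):

```java
abstract class IsolatedObject {
    // Delivery side: synchronized so only one message is handled at a time.
    final synchronized Object deliver(String messageName, Object data) {
        return processIocMessage(messageName, data);
    }

    // The single protected method a subclass implements.
    protected abstract Object processIocMessage(String messageName, Object data);
}

class Counter extends IsolatedObject {
    private int count; // state touched only from message handlers

    @Override
    protected Object processIocMessage(String messageName, Object data) {
        switch (messageName) {
            case "increment": return increment((Integer) data);
            case "get":       return get();
            default:          throw new IllegalArgumentException(messageName);
        }
    }

    // All remaining methods are private handlers for specific messages.
    private Object increment(int by) { count += by; return null; }
    private Object get()             { return count; }
}

public class IocSketch {
    public static void main(String[] args) {
        Counter c = new Counter();
        c.deliver("increment", 2);
        c.deliver("increment", 3);
        System.out.println(c.deliver("get", null)); // 5
    }
}
```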
and all the rest of the methods are private methods to handle specific messages.
The IocTarget is a means of abstracting visibility of an IsolatedObject and is very useful for giving another object a self-reference for sending signals back to you, without exposing your actual object reference.
And the IocBinding simply binds a sender object to a message receiver so that validation checks are not incurred for every message sent, and is created using an IocTarget.
All interaction with the isolated objects is through "sending" them messages - the receiver's processIocMessage method is synchronized, which ensures that only one message is handled at a time.
Having created a situation where all work done by the isolated object is funneled through a single method, I next arranged the objects in a declared hierarchy by means of a "classification" they declare when constructed - simply a string that identifies them as being one of any number of "types of message receiver", which places the object within some predetermined hierarchy. Then I used the message delivery code to ensure that if the sender was itself an IsolatedObject, then for synchronous send/reply messages it was one lower in the hierarchy. Asynchronous messages (signals) are dispatched to message receivers using separate threads in a thread pool whose entire job is to deliver signals; therefore signals can be sent from any object to any receiver in the system. Signals can deliver any message data desired, but no reply is possible.
Because messages can only be delivered in an upward direction (and signals are always upward because they are delivered by a separate thread running solely for that purpose), deadlocks are eliminated by design.
Because interactions between threads are accomplished by exchanging messages using Java synchronization, race conditions and issues of stale data are likewise eliminated by design.
Because any given receiver handles only one message at a time, and because it has no other entry points, all considerations of object state are eliminated - effectively, the object is fully synchronized and synchronization cannot accidentally be left off any method; no getters returning stale cached thread data and no setters changing object state while another method is acting on it.
Because only the interactions between major components are funneled through this mechanism, in practice this has scaled very well - those interactions don't happen nearly as often in practice as I theorized.
The entire design becomes one of an orderly collection of subsystems interacting in a tightly controlled manner.
Note this is not used for simpler situations where worker threads using more conventional thread pools will suffice (though I will often inject the worker's results back into the main system by sending an IOC message). Nor is it used for situations where a thread goes off and does something completely independent of the rest of the system such as an HTTP server thread. Lastly, it is not used for situations where there is a resource coordinator that itself does not interact with other objects and where internal synchronization will do the job without risk of deadlock.
EDIT: I should have stated that the messages exchanged should generally be immutable objects; if using mutable objects the act of sending it should be considered a hand over and cause the sender to relinquish all control, and preferably retain no references to the data. Personally, I use a lockable data structure which is locked by the IOC code and therefore becomes immutable on sending (the lock flag is volatile).