What is the "message as a process" Erlang design pattern? Work queues?
I'm trying to figure out a design pattern that was mentioned in an Erlang talk.
Essentially, the speaker mentions a work queue that uses a "message as a process" rather than the job as a process.
The key idea is that by using a "message as a process" you save the serialization/deserialization overhead.
Thanks
Let M be an Erlang term() that is a message we send around in the system. One obvious way to handle M is to build a pipeline of processes and queues. M is processed by the first worker in the pipeline and then sent on to the next queue. It is then picked up by the next worker process, processed again, and put into another queue, and so on until the message has been fully processed.

The perhaps not-so-obvious way is to define a process P and then hand M to P. We will write this as P(M). Now the message itself is a process and not a piece of data. P does the same job the workers did in the queue solution, but it does not pay the overhead of putting M back into a queue and picking it off again at every stage. When the processing in P(M) is done, the process simply ends its life. If handed another message M', we simply spawn P(M') and let it handle that message concurrently. If we get a set of messages, we do [P(M) || M <- Set], and so on.

If P needs to do synchronization or messaging, it can do so without having to "impersonate" the message, since it is the message. Contrast this with the worker-queue approach, where a worker has to take responsibility for each message that comes its way. If P has an error, only the process P(M) affected by the error will crash. Again, contrast this with the worker-queue approach, where a crash in the pipeline may affect other messages (mostly if the pipeline is badly designed).

So the trick, in conclusion: turn a message into a process that becomes the message.
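To make the P(M) notation concrete, here is a minimal sketch of spawning one process per message. The module name msg_proc, the function names, and the "double the integer" work are all hypothetical stand-ins for whatever the pipeline stages would really do; only spawn_monitor/1 and erlang:demonitor/2 are standard calls.

```erlang
-module(msg_proc).
-export([handle_all/1, process/1]).

%% Hypothetical per-message work; it stands in for every pipeline stage.
%% A non-integer message makes the process crash with function_clause.
process(M) when is_integer(M) -> M * 2.

%% Spawn one monitored process P(M) per message, then collect the
%% outcomes in input order.
handle_all(Msgs) ->
    Parent = self(),
    %% This is the [P(M) || M <- Set] step: each message becomes a process.
    Workers = [spawn_monitor(fun() -> Parent ! {self(), process(M)} end)
               || M <- Msgs],
    [collect(Pid, Ref) || {Pid, Ref} <- Workers].

%% Wait for either the result or the crash notification of one process.
collect(Pid, Ref) ->
    receive
        {Pid, Result} ->
            %% Drop the monitor and any 'DOWN' message already queued.
            erlang:demonitor(Ref, [flush]),
            {ok, Result};
        {'DOWN', Ref, process, Pid, Reason} ->
            {error, Reason}
    end.
```

Calling msg_proc:handle_all([1, 2, oops, 4]) runs four processes concurrently; the bad message crashes only its own process and shows up as an {error, Reason} entry, while the other three still return {ok, Result}.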
The idiom is 'One Process per Message' and is quite common in Erlang. The price and overhead of creating a new process is low enough that this works. If you use the idea, however, you may want some kind of overload protection. You probably want to limit the number of concurrent requests so that you control the load on the system rather than blindly letting it destroy your servers. One such implementation is Jobs, created by Erlang Solutions, see
https://github.com/esl/jobs
and Ulf Wiger is presenting it at:
http://www.erlang-factory.com/conference/ErlangFactoryLiteLA/speakers/UlfWiger
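As a rough illustration of what overload protection means here, the sketch below caps the number of per-message processes in flight. It is not the Jobs API and is far simpler than what Jobs provides; the module name bounded and the function run/3 are made up for this example.

```erlang
-module(bounded).
-export([run/3]).

%% Run Fun(M) in its own process for each M, with at most Max in flight.
%% Returns how many of the processes exited normally.
%% Assumption: the caller holds no other monitors, since any 'DOWN'
%% message in its mailbox is counted as one of ours.
run(Fun, Msgs, Max) when is_function(Fun, 1), is_integer(Max), Max > 0 ->
    loop(Fun, Msgs, Max, 0, 0).

%% Nothing left to start and nothing running: done.
loop(_Fun, [], _Max, 0, Ok) ->
    Ok;
%% Room under the cap: turn the next message into a process.
loop(Fun, [M | Rest], Max, Running, Ok) when Running < Max ->
    spawn_monitor(fun() -> Fun(M) end),
    loop(Fun, Rest, Max, Running + 1, Ok);
%% At the cap, or draining the last processes: wait for one to finish.
loop(Fun, Msgs, Max, Running, Ok) ->
    receive
        {'DOWN', _Ref, process, _Pid, normal} ->
            loop(Fun, Msgs, Max, Running - 1, Ok + 1);
        {'DOWN', _Ref, process, _Pid, _Reason} ->
            loop(Fun, Msgs, Max, Running - 1, Ok)
    end.
```

For example, bounded:run(fun(X) -> 1 div X end, [1, 0, 2], 1) handles the messages one at a time and counts two normal exits, because the division by zero crashes only its own process. A real system would more likely queue or reject excess requests per policy, which is the kind of thing Jobs is meant for.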
As Ulf hints in the talk, we will usually do some preprocessing outside P to parse the message and internalize it to the Erlang system. But as soon as possible we will make the message M into a job by wrapping it in a process (P(M)). Thus we get the benefits of the Erlang scheduler right away.

There is another important ramification of this choice: if processing takes a long time for one message, Erlang's preemptive scheduler ensures that messages with smaller processing needs still get handled quickly. If you have a limited number of worker queues, you may end up with many of them being clogged, hampering the throughput of the system.
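The scheduling point can be observed with a small experiment. In this sketch (module and function names are hypothetical) an expensive message process is started before a cheap one, and the caller reports which of the two answers first. The spin count is an arbitrary amount of busy work chosen for illustration.

```erlang
-module(preempt_demo).
-export([first_done/0]).

%% Burn CPU in plain Erlang; each call costs reductions, so the
%% scheduler can preempt this process partway through.
spin(0) -> ok;
spin(N) -> spin(N - 1).

%% Start a slow message process first and a fast one second, then
%% return the name of whichever reports back first.
first_done() ->
    Parent = self(),
    Tag = make_ref(),
    spawn(fun() -> spin(20000000), Parent ! {Tag, slow} end),
    spawn(fun() -> Parent ! {Tag, fast} end),
    receive
        {Tag, Who} -> Who
    end.
```

In practice preempt_demo:first_done() returns fast even though the slow process was started first and even on a single scheduler thread, because the slow process is preempted once it uses up its time slice. With a single worker draining a queue in order, the cheap message would instead wait behind the expensive one.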