Can message-oriented middleware be used instead of MPI to coordinate distributed computation?
By message-oriented middleware I am referring to technologies such as the Advanced Message Queuing Protocol (AMQP).
Obviously AMQP is a different beast than MPI, but I would think distributed-memory computations that operate in a master-slave manner could be trivially implemented using AMQP, letting AMQP handle equitable work distribution to slaves as they finish pieces instead of managing the queue of work explicitly in the master.
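To make the work-distribution idea concrete, here is a minimal sketch of the pull-based pattern described above. It is not AMQP: an in-process `queue.Queue` and threads stand in for the broker queue and the worker machines, and the names `run_work_queue` and `work_fn` are hypothetical, chosen only for illustration. The point it shows is that the master only enqueues tasks, and each worker takes the next one whenever it finishes, so nobody has to track which worker is idle.

```python
import queue
import threading


def run_work_queue(tasks, num_workers, work_fn):
    """Hypothetical helper: hand out tasks to workers that pull when idle.

    Assumption: queue.Queue stands in for a broker-side work queue;
    this is an in-process illustration, not an AMQP client.
    """
    task_q = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker(worker_id):
        while True:
            task = task_q.get()          # blocks until a task is available
            if task is None:             # sentinel: no more work for this worker
                task_q.task_done()
                return
            out = work_fn(task)
            with results_lock:
                results.append((worker_id, task, out))
            task_q.task_done()           # roughly analogous to acknowledging a message

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_workers)
    ]
    for t in threads:
        t.start()

    # The "master" only enqueues work; it never assigns a task to a specific worker.
    for task in tasks:
        task_q.put(task)
    for _ in threads:
        task_q.put(None)                 # one sentinel per worker

    task_q.join()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for worker_id, task, out in run_work_queue(range(10), 3, lambda x: x * x):
        print(f"worker {worker_id}: {task} -> {out}")
```

Which worker handles which task varies from run to run; faster workers simply end up taking more tasks.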
The added benefit of AMQP (if you had thousands of machines working together) would be that the death of a single machine wouldn't stall progress of the computation at MPI_Bcast calls, because AMQP could simply use a fanout instead of MPI_Bcast, and that would be non-blocking to the progress of the overall computation.
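The contrast with a collective broadcast can be sketched as follows. Again this is not AMQP: the `FanoutExchange` class below is a hypothetical in-process stand-in in which every bound consumer gets its own queue and the publisher copies each message into all of them without waiting for anyone to read. Under that assumption, a consumer that has died simply accumulates a backlog while the others carry on. A real broker adds details this sketch ignores, such as queue limits and message expiry.

```python
import queue


class FanoutExchange:
    """Hypothetical stand-in for a fanout exchange: one queue per consumer."""

    def __init__(self):
        self._queues = {}

    def bind(self, name):
        # Each consumer gets a private queue bound to the exchange.
        q = queue.Queue()
        self._queues[name] = q
        return q

    def publish(self, message):
        # Copy the message to every bound queue; never wait for a consumer.
        for q in self._queues.values():
            q.put_nowait(message)


def demo():
    exchange = FanoutExchange()
    alive = exchange.bind("worker-a")
    dead = exchange.bind("worker-b")     # assumption: this worker crashed and never reads

    exchange.publish("params-v1")
    exchange.publish("params-v2")        # publishing is not held up by worker-b

    received = [alive.get_nowait(), alive.get_nowait()]
    backlog_for_dead_worker = dead.qsize()
    return received, backlog_for_dead_worker


if __name__ == "__main__":
    print(demo())
```

MPI_Bcast, by comparison, is a collective operation that every process in the communicator is expected to call, which is why a missing participant is a problem there.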
Are there any examples of AMQP being used for task coordination in distributed computation?
Update: Gearman provides a really nice approach to fault tolerant distributed computation.
Comments (1)
I think it's helpful to distinguish between distributed computation and parallel computation. I take the view that parallel computation is a sub-class of distributed computation. In distributed computing many processors are used to tackle a problem, the problem may be decomposed into a variety of tasks (e.g. client-server, to give a very simple example), and processors may be running a variety of codes.
In parallel computation, however, each processor is likely to be running the same code but getting a different part of the data to process.
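As a rough illustration of "same code, different part of the data", the sketch below runs one function for every rank and lets the rank number decide which slice of the input it works on. The names `local_chunk` and `spmd_sum` are hypothetical, and the ranks are simulated in a plain loop rather than running as separate processes.

```python
def local_chunk(data, rank, size):
    """Return the contiguous slice of data owned by this rank.

    The first (len(data) % size) ranks get one extra element, so the
    slices differ in length by at most one.
    """
    base, extra = divmod(len(data), size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return data[start:stop]


def spmd_sum(data, size):
    """Simulate `size` ranks running identical code on different slices."""
    # Every "rank" executes the same expression; only `rank` differs.
    partials = [sum(local_chunk(data, rank, size)) for rank in range(size)]
    # Combining the partial results plays the role of a reduction step.
    return sum(partials)


if __name__ == "__main__":
    values = list(range(10))
    for rank in range(3):
        print(rank, local_chunk(values, rank, 3))
    print(spmd_sum(values, 3))
```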
Now, there is no hard and fast line where distributed computation ends and parallel begins, but if you look at the two ends of the spectrum there are canonical examples which have very different characteristics. I suppose Google might demonstrate a canonical example of distributed computation, while the kinds of scientific simulations that the large supercomputer centres run provide a canonical example of parallel computation.
All of the foregoing is simply background to my answer to your question:
Yes, you could certainly use AMQP to tackle parallel computations, and yes, you could use MPI to implement distributed computations, but I think you would be struggling against features of the protocols, which are designed for opposite ends of the spectrum.
And no, I don't know of anyone using AMQP for doing what I call parallel computation.