Spread vs MPI vs Zeromq?

Posted on 2024-07-04 17:48:59

In one of the answers to "Broadcast like UDP with the Reliability of TCP", a user mentions the Spread messaging API. I've also run across one called ØMQ, and I have some familiarity with MPI.

So, my main question is: why would I choose one over the other? More specifically, why would I choose to use Spread or ØMQ when there are mature implementations of MPI to be had?

Comments (3)

孤独患者 2024-07-11 17:48:59

I have not used any of these libraries, but I may be able to give some hints.

  1. MPI is a communication standard (a specification with several implementations), while Spread and ØMQ are actual implementations.
  2. MPI comes from "parallel" programming while Spread comes from "distributed" programming.

So, it really depends on whether you are trying to build a parallel system or a distributed system. They are related to each other, but the implied connotations and goals are different. Parallel programming deals with increasing computational power by using multiple computers simultaneously. Distributed programming deals with reliable (consistent, fault-tolerant and highly available) groups of computers.

The concept of "reliability" here is slightly different from that of TCP. TCP's reliability is "give this packet to the end program no matter what." Distributed programming's reliability is "even if some machines die, the system as a whole continues to work in a consistent manner." To really guarantee that all participants got the message, one would need something like two-phase commit or one of its faster alternatives.
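To make the two-phase commit idea concrete, here is a minimal in-memory sketch. It uses none of the libraries discussed; the `Participant` class and `two_phase_commit` function are hypothetical names made up for illustration. A message is delivered to everyone only if every participant votes yes in the prepare phase; otherwise it is delivered to no one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Participant:
    """Hypothetical group member; `alive` simulates a crashed machine."""
    name: str
    alive: bool = True
    staged: Optional[str] = None
    log: List[str] = field(default_factory=list)

    def prepare(self, message: str) -> bool:
        # Phase 1: stage the message and vote. A dead node cannot vote yes.
        if not self.alive:
            return False
        self.staged = message
        return True

    def commit(self) -> None:
        # Phase 2 (success): make the staged message visible.
        if self.staged is not None:
            self.log.append(self.staged)
        self.staged = None

    def abort(self) -> None:
        # Phase 2 (failure): discard whatever was staged.
        self.staged = None


def two_phase_commit(participants: List[Participant], message: str) -> bool:
    """Deliver `message` to all participants or to none of them."""
    votes = [p.prepare(message) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

A real protocol also needs durable logs, timeouts, and a way to recover when the coordinator itself fails; this sketch only shows the all-or-nothing voting structure.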

抱着落日 2024-07-11 17:48:59

MPI was designed for tightly coupled compute clusters with fast, reliable networks. Spread and ØMQ are designed for large distributed systems. If you're designing a parallel scientific application, go with MPI, but if you are designing a persistent distributed system that needs to be resilient to faults and network instability, use one of the others.

MPI has very limited facilities for fault tolerance; the default error handling behavior in most implementations is to abort the whole job. Also, the semantics of MPI require that all messages sent eventually be consumed. This makes a lot of sense for simulations on a cluster, but not for a distributed application.

〗斷ホ乔殘χμё〖 2024-07-11 17:48:59

You're addressing very different APIs here, with different notions about the kind of services provided and infrastructure for each of them. I don't know enough about MPI and Spread to answer for them, but I can help a little more with ZeroMQ.

ZeroMQ is a simple messaging library. It does nothing more than send messages to different peers (including local ones) based on a restricted set of common messaging patterns (PUSH/PULL, REQUEST/REPLY, PUB/SUB, etc.). It handles client connection, retrieval, and basic congestion strictly based on those patterns, and you have to do the rest yourself.
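To give a feel for what "patterns" means here, below is a toy model of the delivery rules of two of them, written with plain Python lists instead of sockets. It does not use ZeroMQ at all: `PushSocket` and `PubSocket` are hypothetical classes invented for this sketch, and the subscriptions are registered on the publisher only to keep it short (in ZeroMQ they are set on the subscribing socket). PUSH hands each message to exactly one worker, in rotation; PUB copies each message to every subscriber whose prefix matches.

```python
from itertools import cycle
from typing import List, Tuple


class PushSocket:
    """Toy PUSH end: each message goes to exactly one worker, round-robin."""

    def __init__(self, workers: List[List[str]]) -> None:
        self._next_worker = cycle(workers)

    def send(self, message: str) -> None:
        next(self._next_worker).append(message)


class PubSocket:
    """Toy PUB end: each message goes to every subscriber with a matching prefix."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, List[str]]] = []

    def subscribe(self, prefix: str, inbox: List[str]) -> None:
        # An empty prefix matches every message.
        self._subscriptions.append((prefix, inbox))

    def send(self, message: str) -> None:
        for prefix, inbox in self._subscriptions:
            if message.startswith(prefix):
                inbox.append(message)


# Work distribution: four jobs shared between two workers.
worker_a: List[str] = []
worker_b: List[str] = []
push = PushSocket([worker_a, worker_b])
for i in range(4):
    push.send(f"job{i}")

# Fan-out: one subscriber filters by topic, the other takes everything.
weather_only: List[str] = []
everything: List[str] = []
pub = PubSocket()
pub.subscribe("weather.", weather_only)
pub.subscribe("", everything)
pub.send("weather.paris 21C")
pub.send("sports.score 2-1")
```

Everything outside these delivery rules (discovery, deployment, monitoring) is left to the application, which is the point the answer makes.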

Although it appears very restricted, this simple behavior is mostly what you need for the communication layer of your application. It lets you scale very quickly from a simple prototype, all in memory, to more complex distributed applications in various environments, using simple proxies and gateways between nodes. However, don't expect it to do node deployment, network discovery, or server monitoring; you will have to do that yourself.

Briefly, use ZeroMQ if you have an application that you want to scale from a simple multithreaded process to a distributed and variable environment, or if you want to experiment and prototype quickly and no existing solution seems to fit your model. Expect, however, to put some effort into deploying and monitoring your network if you want to scale to a very large cluster.
