What is the difference between Cluster and MPP supercomputer architectures?

Posted 2024-10-30 07:59:32 | 34 characters | 0 views | 0 comments

What is the difference between a Cluster and MPP supercomputer architecture?

Comments (4)

錯遇了你 2024-11-06 07:59:32

In a cluster, each machine is largely independent of the others in terms of memory, disk, etc. They are interconnected using some variant of ordinary networking. The cluster exists mostly in the mind of the programmer and in how they choose to distribute the work.

In a Massively Parallel Processor, there really is only one machine with thousands of CPUs tightly interconnected. MPPs have exotic memory architectures to allow extremely high speed exchange of intermediate results with neighboring processors.

The major variants are SIMD (Single Instruction, Multiple Data) and MIMD (Multiple Instruction, Multiple Data). In a SIMD system, every processor executes the same instruction at the same time, only on different pieces of data. Essentially, there is only one program counter (PC). In a MIMD machine, each CPU has its own PC.
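To make the program-counter point concrete, here is a small toy simulation in Python. It is only a sketch: `run_simd`, `run_mimd`, `double` and `inc` are made-up names for illustration, and real hardware obviously does not work through Python lists. The SIMD version keeps a single PC and applies each instruction to every data lane in lockstep; the MIMD version gives each processor its own program and its own PC.

```python
def double(x):
    return x * 2


def inc(x):
    return x + 1


def run_simd(program, lanes):
    """One shared program counter: every lane runs the same instruction in lockstep."""
    lanes = list(lanes)
    pc = 0  # the single program counter for all lanes
    while pc < len(program):
        instruction = program[pc]
        lanes = [instruction(value) for value in lanes]
        pc += 1
    return lanes


def run_mimd(programs, data):
    """Each processor has its own program and its own program counter."""
    values = list(data)
    pcs = [0] * len(programs)  # one program counter per processor
    while any(pc < len(prog) for pc, prog in zip(pcs, programs)):
        for i, prog in enumerate(programs):
            if pcs[i] < len(prog):
                values[i] = prog[pcs[i]](values[i])
                pcs[i] += 1
    return values


simd_result = run_simd([double, inc], [1, 2, 3])
mimd_result = run_mimd([[double], [inc, inc], []], [1, 2, 3])
```

In the SIMD call all three lanes go through `double` and then `inc` together; in the MIMD call the three processors run different programs of different lengths and finish at different times.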

MPPs can be a bitch to program and are of use only on algorithms that are embarrassingly parallel (that's actually what they call it). However, if you have such a problem, then an MPP can be shockingly fast. They are also incredibly expensive.

瑶笙 2024-11-06 07:59:32

The TOP500 list uses a slightly different distinction between an MPP and a cluster, as explained in the paper by Dongarra et al.:

[a cluster is a] parallel computer system comprising an integrated collection of independent nodes, each of which is a system in its own right, capable of independent operation and derived from products developed and marketed for other stand-alone purposes

Compared to a cluster, a modern MPP (such as the IBM Blue Gene) is more tightly integrated: individual nodes cannot run on their own, and they are connected by a custom network (like a multidimensional torus). But, as in a cluster, there is no single shared memory spanning all the nodes (note: an MPP might be hierarchical, and shared memory might be used inside a single node (NUMA) or between a handful of nodes).
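As a rough illustration of what a multidimensional torus means for a node, the sketch below computes the direct neighbors of a node in an N-dimensional torus with wraparound links. The function name `torus_neighbors` and the 4x4x4 example size are assumptions chosen for illustration; this is not taken from any vendor's actual routing code.

```python
def torus_neighbors(coord, dims):
    """Return the direct neighbors of `coord` in a torus of shape `dims`.

    Each axis contributes two neighbors (one step down, one step up),
    and the modulo makes the edges wrap around.
    """
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            neighbor = list(coord)
            neighbor[axis] = (neighbor[axis] + step) % size
            neighbors.append(tuple(neighbor))
    return neighbors


corner = torus_neighbors((0, 0, 0), (4, 4, 4))
```

Even the "corner" node (0, 0, 0) has six neighbors in a 3D torus, because the links wrap around. Note that along an axis of size 2 both steps land on the same node, so this sketch would list that neighbor twice.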

I would therefore be very careful about using the terms SIMD and MIMD in this context, as they usually describe shared-memory architectures (SMP).

Update:

Dongarra et al. link

Update:
An MPP can have nodes that use shared memory internally, but the memory of the MPP as a whole is not shared.
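A toy sketch of that last point, assuming a made-up `Node` class and `global_sum` helper: inside a node, code reads the node's own memory directly, but data held by another node can only arrive as an explicit message. On a real machine the message step would typically go through something like MPI over the interconnect; here it is just a Python list acting as an inbox.

```python
class Node:
    """A hypothetical compute node with private memory and a message inbox."""

    def __init__(self, node_id, local_data):
        self.node_id = node_id
        self.memory = list(local_data)  # visible only to code running on this node
        self.inbox = []

    def local_sum(self):
        # Work inside the node reads its own memory directly.
        return sum(self.memory)

    def send(self, other, value):
        # Crossing a node boundary requires an explicit message.
        other.inbox.append((self.node_id, value))


def global_sum(nodes):
    """Every node sends its partial sum to node 0, which combines them."""
    root = nodes[0]
    for node in nodes[1:]:
        node.send(root, node.local_sum())
    total = root.local_sum()
    while root.inbox:
        _sender, value = root.inbox.pop()
        total += value
    return total


nodes = [Node(0, [1, 2]), Node(1, [3, 4]), Node(2, [5])]
total = global_sum(nodes)
```

The root never touches `nodes[1].memory`; it only sees the partial sums that were sent to it, which is the distributed-memory part of the picture.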

疏忽 2024-11-06 07:59:32

A cluster is a bunch of machines, usually interconnected over Ethernet (read: an ordinary network), each running its own separate copy of an OS, which together happen to serve a single purpose.

An MPP supercomputer usually implies a very fast proprietary interconnect (e.g. SGI NUMAlink) that supports either Distributed Shared Memory (processes running on different MPP nodes share data over the fast interconnect as if they were running on a single computer) or even a Single System Image (a single instance of an operating system, mostly Linux, running across all the nodes as if on a single machine; e.g. "ps aux" on any node will show you all the processes running on the MPP).

As you can see, the definition is quite fluid; it's more a question of scale than of clear-cut differences.

抹茶夏天i‖ 2024-11-06 07:59:32

I've searched a lot of HPC literature and couldn't find a concrete definition of MPP. There is fairly broad consensus that a cluster consists of multiple interconnected regular personal computers or workstations, usually coupled with standard technologies (like Ethernet or open-source operating systems). The term MPP is usually applied to more proprietary approaches to building distributed-memory computers, usually involving proprietary technologies.

For example, Tianhe-2 is considered a cluster because it uses x86-64 nodes and a regular operating system (Kylin Linux). Sunway TaihuLight is considered an MPP because its nodes use their own architecture, SW26010, and run their own operating system, Sunway Raise OS.

The most concrete explanation of this matter I found was in Sourcebook of Parallel Computing (Dongarra et al.):

We note that the term cluster can be applied both broadly (any system built with a significant number of commodity components) or narrowly (only commodity components and open-source software). In fact, there is no precise definition of a cluster. Some of the issues that are used to argue that a system is a massively parallel processor (MPP) instead of a cluster include proprietary interconnects (...), particularly ones designed for a specific parallel computer, and special software that treats the entire system as a single machine, particularly for the system administrators. Clusters may be built from personal computers or workstations (either single processors or symmetric multiprocessors (SMPs)) and may run either open-source or proprietary operating systems.
