Vertical and horizontal parallelism
Recently, working in the parallel domain, I came to know that there are two terms, "vertical parallelism" and "horizontal parallelism". Some people describe OpenMP (shared-memory parallelism) as vertical parallelism and MPI (distributed-memory parallelism) as horizontal parallelism. Why are these terms called so? I am not getting the reason. Is it just terminology?
The terms don't seem to be widely used, perhaps because a process or system often uses both without distinction. These concepts are very generic, covering much more than the realm of MPI or OpenMP.
Vertical parallelism is the ability of a system to employ several different devices at the same time. For instance, a program may have one thread doing heavy computation while another is handling DB queries and a third is doing IO. Most operating systems naturally expose this ability.
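A minimal sketch of this idea in Python, assuming three placeholder tasks (the computation, "DB query", and "IO" bodies here are stand-ins, not real workloads):

```python
import threading
import queue

results = queue.Queue()

def compute():
    # stands in for heavy numeric computation
    results.put(("compute", sum(i * i for i in range(10_000))))

def db_task():
    # stands in for a database query
    results.put(("db", "rows fetched"))

def io_task():
    # stands in for file or network IO
    results.put(("io", "data read"))

# three *dissimilar* activities running at the same time: vertical parallelism
threads = [threading.Thread(target=f) for f in (compute, db_task, io_task)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(dict(results.queue))
```

The point is that each thread runs a different kind of work, rather than the same code over different data.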
Horizontal parallelism occurs when a single device is used, or when an operation is executed on several similar items of data. This is the sort of parallelism that happens, for instance, when running several threads over the same piece of code but with different data.
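The same-code-different-data pattern can be sketched with a thread pool (the `process` function and chunk size are illustrative choices, not prescribed by anything above):

```python
from concurrent.futures import ThreadPoolExecutor

def process(chunk):
    # identical code applied to every chunk
    return sum(x * x for x in chunk)

data = list(range(100))
# split the data into four similar pieces
chunks = [data[i:i + 25] for i in range(0, 100, 25)]

# horizontal parallelism: the same function runs on each chunk concurrently
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(process, chunks))

print(sum(partials))  # same result as processing the whole list serially
```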
In the software world, an interesting example is actually the MapReduce algorithm, which uses both:
horizontal parallelism occurs at the map stage, when data is split and scattered across several CPUs for processing,
vertical parallelism happens between the map and reduce stages, where data is first divided into chunks, then processed by the map threads, and finally accumulated by the reduce thread.
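The two dimensions described above can be combined in a toy MapReduce sketch (the `map_stage`/`reduce_stage` names and the doubling workload are hypothetical examples):

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def map_stage(chunk):
    # identical code over different chunks: the horizontal dimension
    return [x * 2 for x in chunk]

def reduce_stage(acc, mapped_chunk):
    # a distinct stage consuming the map output: the vertical dimension
    return acc + sum(mapped_chunk)

data = list(range(10))
chunks = [data[:5], data[5:]]

# map stage: horizontally parallel across chunks
with ThreadPoolExecutor(max_workers=2) as pool:
    mapped = list(pool.map(map_stage, chunks))

# reduce stage: chained after the map stage, accumulating its results
total = reduce(reduce_stage, mapped, 0)
print(total)
```

Here the map workers form the horizontal axis, while the map-then-reduce chaining is the vertical one.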
Similarly, in the hardware world, superscalar pipelined CPUs use both variations, where pipelining is a particular instance of vertical parallelisation (just like the map/reduce staging, but with several more steps).
The reason behind this terminology probably comes from the same reason it is used with supply chains: value is produced by chaining different steps or levels of processing. The final product can be seen as the root of an abstract tree of construction (from bottom to top) or dependency (from top to bottom), where each node is the result of an intermediate level or step. You can easily see the analogy between supply chains and computation here.