MapReduce 在原始函数式语言中的可扩展性如何?
Map-Reduce 编程模型源于映射和化简函数,这些函数早在 Lisp 和Scheme 等函数式语言中就已存在。
我记得在大学(90 年代初)时,我就被告知 Map-Reduce 在可扩展性方面具有优势。
目前我们都知道 Hadoop 以及从 Google 复制的原始版本。 我想知道的是“旧”功能语言中存在哪些选项可以在至少几个计算节点上执行 Map-Reduce?
或者这是那些在纸面上看起来不错但在谷歌之前没有人真正构建的功能之一?
The Map-Reduce programming model stems from the map and reduce functions which are present in functional languages like Lisp and Scheme dating back many many years.
I remember from university (early 90's) that even back then I was told Map-Reduce had advantages in terms of scalability.
At the moment we all know about Hadoop and the original from Google it was copied from.
What I was wondering about is what options exist in "old" functional languages to do Map-Reduce over at least a few compute nodes?
Or is this one of those features that looked good on paper but no one ever got around to actually building until Google did it?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
Map/Reduce 是数据并行性的特例。
数据并行(不仅仅是
map
和fold
)广泛应用于高性能计算语言和并行函数语言中。谷歌和其他公司已经为其用例构建了高度优化(受限)的分布式编程模型,但他们肯定完全了解其他地方这项工作的起源和状态。HPC 语言,例如
和纯函数式语言,完全数据并行:
都支持分布式或多核系统的完整数据并行编程模型。特别是,Chapel、Fortress 和 X10 的目标是在世界上最大的计算机集群上实现大规模可扩展性。许多其他语言支持并行映射和折叠的一些概念(例如 Erlang、Clojure、Scala、F#)。
因此,谷歌确实普及了数据并行性,其基本形式是映射/归约,但这并不是故事的结局。
Map/Reduce is a special case of data parallelism.
Data parallelism (which is more than just
map
andfold
) is widely used in high performance computing languages, and in parallel functional languages. Google and others have built a highly optimized (restricted) distributed programming model for their use case, but they're surely fully aware of the origins and state of this work elsewhere.HPC languages, such as
and purely functional languages, with full data parallelism:
all support a full data parallel programming model, for either distributed or multicore systems. In particular, Chapel, Fortress and X10 are aimed at massive scalability on the world's largest computer clusters. Many other languages support some notion of parallel map and fold (e.g. Erlang, Clojure, Scala, F#).
So, certainly Google popularized data parallelism, in its basic form as map/reduce, but that's not the end of the story.