JVM (embarrassingly) parallel processing library/tool
I am looking for something that will make it easy to run (correctly coded) embarrassingly parallel JVM code on a cluster (so that I can use Clojure + Incanter).
I have used Parallel Python in the past to do this. We have a new PBS cluster and our admin will soon set up IPython nodes that use PBS as the backend. Both of these systems make it almost a no-brainer to run certain types of code in a cluster.
I made the mistake of using Hadoop in the past (Hadoop is just not suited to the kind of data that I use) - the latency made even small runs execute for 1-2 minutes.
Is JPPF or Gridgain better for what I need? Does anyone here have any experience with either? Is there anything else you can recommend?
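To make the shape of the workload concrete, here is a minimal single-machine sketch in Java of what "embarrassingly parallel" means here: many independent tasks with no shared state, whose results are simply collected at the end. It uses only `java.util.concurrent` and is not JPPF or GridGain code; `simulate` and `runAll` are hypothetical names made up for this illustration, and `simulate` is a placeholder for whatever the real computation would be. A cluster tool would presumably be handed tasks of this kind and spread them across nodes instead of local threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class EmbarrassinglyParallelSketch {

    // Hypothetical stand-in for one independent unit of work.
    // It depends only on its own input, never on other tasks.
    static long simulate(int seed) {
        long acc = 0;
        for (int i = 1; i <= 1000; i++) {
            acc += (long) seed * i;
        }
        return acc;
    }

    // Runs taskCount independent tasks on a fixed-size local thread pool
    // and returns the results in task order.
    static List<Long> runAll(int taskCount, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int seed = 0; seed < taskCount; seed++) {
                final int s = seed;
                tasks.add(() -> simulate(s));
            }
            List<Long> results = new ArrayList<>();
            // invokeAll blocks until every task has finished and returns
            // the futures in the same order as the submitted tasks.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(4, 2));
    }
}
```

Because the tasks never communicate, the only coordination is handing out inputs and gathering outputs, which is why per-job startup latency (as with Hadoop above) dominates small runs.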
Comments (4)
Check out cascalog - http://github.com/nathanmarz/cascalog
Clojure is reported to work on Terracotta, subject to some patching.
Look at Skandium.
Edit:
The link above is no longer live, so here is the GitHub link:
https://github.com/mleyton/Skandium
I suggest you look at Skandium; alternative licenses to the GPL can be negotiated with the developers upon request.