I need to build a heavy-duty molecular dynamics simulator. I am wondering whether Python + NumPy is a good choice. This will be used in production, so I want to start with a good language. Should I instead start with a functional language such as Scala? Is there enough library support for scientific computation in Scala? Or is there any other language/paradigm combination you think is good, and why? If you have actually built something in the past and are speaking from experience, please mention it, as it will help me collect data points.
Thanks very much!
Comments (4)
The high-performing MD implementations tend to be decidedly imperative (as opposed to functional), with big arrays of data trumping object-oriented design. I've worked with LAMMPS, and while it has its warts, it does get the job done. A perhaps more appealing option is HOOMD, which has been optimized from the beginning for Nvidia GPUs with CUDA. HOOMD doesn't have all the features of LAMMPS, but the interface seems a bit nicer (it's scriptable from Python) and it performs very well.
I've actually implemented my own MD code a couple of times (in Java and Scala) using a high-level object-oriented design, and found the performance disappointing compared to the popular MD implementations, which are heavily tuned and use C++/CUDA. These days, it seems few scientists write their own MD implementations, but it is useful to be able to modify existing ones.
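To make the "big arrays of data" point concrete, here is a minimal sketch of an array-based layout in Python + NumPy. It is not how LAMMPS or HOOMD are implemented. It assumes a Lennard-Jones potential in reduced units, no cutoff, no periodic boundaries, and an O(N^2) all-pairs force evaluation, and the function names (`lj_forces`, `velocity_verlet`, `total_energy`) are made up for illustration. Positions and velocities live in plain `(N, 3)` arrays rather than in per-particle objects.

```python
import numpy as np

def lj_forces(pos):
    """Return (forces, potential energy) for Lennard-Jones particles, reduced units."""
    # rij[i, j] = pos[i] - pos[j], shape (N, N, 3)
    rij = pos[:, None, :] - pos[None, :, :]
    r2 = np.einsum("ijk,ijk->ij", rij, rij)
    np.fill_diagonal(r2, np.inf)  # exclude self-interaction
    inv_r6 = 1.0 / r2**3
    # U = 4 (r^-12 - r^-6); the full matrix counts every pair twice
    energy = 0.5 * np.sum(4.0 * (inv_r6**2 - inv_r6))
    # F_i = sum_j 24 (2 r^-12 - r^-6) / r^2 * rij
    coef = 24.0 * (2.0 * inv_r6**2 - inv_r6) / r2
    forces = np.einsum("ij,ijk->ik", coef, rij)
    return forces, energy

def velocity_verlet(pos, vel, dt, n_steps, mass=1.0):
    """Advance the system n_steps with the velocity Verlet scheme."""
    pos, vel = pos.copy(), vel.copy()
    forces, _ = lj_forces(pos)
    for _ in range(n_steps):
        vel += 0.5 * dt * forces / mass  # first half kick
        pos += dt * vel                  # drift
        forces, _ = lj_forces(pos)
        vel += 0.5 * dt * forces / mass  # second half kick
    return pos, vel

def total_energy(pos, vel, mass=1.0):
    _, potential = lj_forces(pos)
    return potential + 0.5 * mass * np.sum(vel**2)

# Hypothetical test system: a 2x2x2 cubic cluster, initially at rest.
grid = np.arange(2) * 1.12
pos0 = np.array(np.meshgrid(grid, grid, grid, indexing="ij")).reshape(3, -1).T
vel0 = np.zeros_like(pos0)

e0 = total_energy(pos0, vel0)
pos1, vel1 = velocity_verlet(pos0, vel0, dt=0.002, n_steps=500)
e1 = total_energy(pos1, vel1)
print(f"energy drift after 500 steps: {abs(e1 - e0):.2e}")
```

The whole state is three arrays, so every step is a handful of whole-array operations. A production code would add neighbour lists, cutoffs and periodic boundaries, which is where the tuned C++/CUDA packages pull far ahead.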
I believe that most highly performant MD codes are written in native languages like Fortran, C or C++. Modern GPU programming techniques have also been gaining favour recently.
A language like Python would allow for much more rapid development than native code. The flip side is that performance is typically worse than for compiled native code.
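As a rough illustration of that trade-off, the sketch below computes the same Lennard-Jones pair energy twice: once with an interpreted Python loop and once with NumPy, which hands the inner loop to compiled code. The function names and the jittered-lattice test system are invented for this example, and the timings it prints will vary by machine, so treat it as a way to measure the gap yourself rather than as a benchmark.

```python
import itertools
import math
import time

import numpy as np

def lj_energy_loop(points):
    """Pair energy with a plain Python loop over all pairs (reduced units)."""
    total = 0.0
    for (xi, yi, zi), (xj, yj, zj) in itertools.combinations(points, 2):
        r2 = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
        inv_r6 = 1.0 / r2**3
        total += 4.0 * (inv_r6**2 - inv_r6)
    return total

def lj_energy_numpy(points):
    """Same pair energy, with the pair loop expressed as array operations."""
    pos = np.asarray(points, dtype=float)
    i, j = np.triu_indices(len(pos), k=1)  # every pair i < j exactly once
    d = pos[i] - pos[j]
    r2 = np.einsum("ij,ij->i", d, d)
    inv_r6 = 1.0 / r2**3
    return float(np.sum(4.0 * (inv_r6**2 - inv_r6)))

# Hypothetical test system: a 4x4x4 lattice with small random displacements.
rng = np.random.default_rng(0)
grid = np.arange(4) * 1.2
lattice = np.array(np.meshgrid(grid, grid, grid, indexing="ij")).reshape(3, -1).T
points = (lattice + rng.uniform(-0.05, 0.05, lattice.shape)).tolist()

t0 = time.perf_counter()
e_loop = lj_energy_loop(points)
t1 = time.perf_counter()
e_numpy = lj_energy_numpy(points)
t2 = time.perf_counter()

print(f"loop:  {e_loop:.6f} in {t1 - t0:.4f} s")
print(f"numpy: {e_numpy:.6f} in {t2 - t1:.4f} s")
```

Both functions should agree to within rounding error. Vectorising narrows the gap to native code but does not close it: the all-pairs arrays cost O(N^2) memory, and anything that cannot be phrased as a whole-array operation falls back to the interpreter.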
A question for you: why are you writing your own MD code? There are many, many libraries out there. Can't you find one that suits your needs?
Another alternative if you want to use Python is to take a look at OpenMM:
https://simtk.org/home/openmm
It's a molecular dynamics API that has many of the basic elements you need (integrators, thermostats, barostats, etc.) and supports running on the CPU via OpenCL and on GPUs via CUDA and OpenCL. It has a Python wrapper that I've used before, which basically mimics the underlying C API calls. It's been incorporated into Gromacs and MDLab, so you have some examples of how to integrate it if you're really dead set on building something from (semi) scratch.
However, as others have said, I highly recommend taking a look at NAMD, Gromacs, HOOMD, LAMMPS, DL_POLY, etc. to see whether one of them fits your needs before you embark on reinventing the wheel.