MPI parallelism in a chaotic system
I have a Fortran program for dynamics (basically a Verlet algorithm). To compute the velocities faster, I parallelized the algorithm with MPI. What makes me nervous is that with four processors, each processor runs its own Verlet integration, and when they reach a point of parallelization they share information. However, due to slight numerical differences (for example, in the LAPACK build on each node), each Verlet trajectory may evolve in a completely different direction in the long run, meaning that at the sharing points I will obtain a mix of information from different trajectories. I therefore decided to synchronize the information at every time step to prevent divergence, but this clearly introduces a barrier.
How is this problem (divergence of the nodes) normally solved? Any references?
Comments (1)
Well, you shouldn't have different compiles of LAPACK on each node. If your numerical libraries change in different parts of the simulation, you should expect weird results -- and that has nothing to do with parallelism. So don't do that.
The only place I've really seen MPI introduce trickiness in situations like this is that operations such as MPI_REDUCE(...MPI_SUM...) can give different answers on the same number of nodes from run to run, because the summation may happen in a different order. That's just the standard "floating-point addition isn't associative" issue. You can avoid it by doing an MPI_GATHER() of the relevant numbers and summing them in some well-defined order, for instance after sorting them from smallest to largest in magnitude.
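As an illustration, here is a minimal sketch of that idea (not code from the question; the per-rank contribution, the root rank 0, and the program name are made up for the example): each rank sends one partial sum to the root with MPI_GATHER, the root sorts the values by magnitude and adds them in that fixed order, then broadcasts the reproducible total back.

program deterministic_sum
  use mpi
  implicit none
  integer :: ierr, rank, nprocs, i, j
  double precision :: local_sum, total, tmp
  double precision, allocatable :: parts(:)

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  ! Stand-in for a real per-rank contribution (e.g. a partial force or energy).
  local_sum = 1.0d0 / dble(rank + 1)

  allocate(parts(nprocs))
  call MPI_GATHER(local_sum, 1, MPI_DOUBLE_PRECISION, &
                  parts,     1, MPI_DOUBLE_PRECISION, &
                  0, MPI_COMM_WORLD, ierr)

  if (rank == 0) then
     ! Insertion sort by magnitude, smallest first, so the summation order
     ! is identical on every run regardless of message arrival order.
     do i = 2, nprocs
        tmp = parts(i)
        j = i - 1
        do while (j >= 1)
           if (abs(parts(j)) <= abs(tmp)) exit
           parts(j + 1) = parts(j)
           j = j - 1
        end do
        parts(j + 1) = tmp
     end do
     total = 0.0d0
     do i = 1, nprocs
        total = total + parts(i)
     end do
  end if

  ! Give every rank the same, reproducible total.
  call MPI_BCAST(total, 1, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)

  if (rank == 0) print *, 'deterministic total = ', total

  deallocate(parts)
  call MPI_FINALIZE(ierr)
end program deterministic_sum

This trades a little extra communication and a serial sum on the root for bitwise-identical results across runs, which in turn keeps the replicated Verlet trajectories from drifting apart due to reduction order alone.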