高效的收敛检查
我有一个包含数千个双精度实数的网格。
它正在迭代,我需要它在收敛到小数点后三位时停止。
目标是让它尽可能快地运行,但需要每次(到 3 dp)给出相同的结果。
目前我正在做这样的事情,
REAL(KIND=DP) :: TOL = 0.001_DP
DO WHILE(.NOT. CONVERGED)
CONVERGED = .TRUE.
DO I = 1, NUM_POINTS
NEW POTENTIAL = !blah blah blah
IF (CONVERGED) THEN
IF (NEW_POTENTIAL < OLD_POTENTIAL - TOL .OR. NEW_POTENTIAL > OLD_POTENTIAL + TOL) THEN
CONVERGED = .FALSE.
END IF
END IF
OLD_POTENTIAL = NEW POTENTIAL
END DO
END DO
我认为许多 IF 语句对于性能来说不会太大。我考虑过最后检查是否收敛;找到平均值(对整个网格求和,除以 num_points),并检查是否以与上面相同的方式收敛,但我不相信这总是准确的。
这样做的最佳方法是什么?
I have a grid with thousands of double precision reals.
It's iterating through, and I need it to stop when it's reached convergence to 3 decimal places.
The target is to have it run as fast as possible, but needs to give the same result every (to 3 dp) every time.
At the minute I'm doing something like this
REAL(KIND=DP) :: TOL = 0.001_DP
DO WHILE(.NOT. CONVERGED)
CONVERGED = .TRUE.
DO I = 1, NUM_POINTS
NEW POTENTIAL = !blah blah blah
IF (CONVERGED) THEN
IF (NEW_POTENTIAL < OLD_POTENTIAL - TOL .OR. NEW_POTENTIAL > OLD_POTENTIAL + TOL) THEN
CONVERGED = .FALSE.
END IF
END IF
OLD_POTENTIAL = NEW POTENTIAL
END DO
END DO
I'm thinking that many IF statements can't be too great for performance. I thought about checking for convergence at the end; finding the average value (summing the whole grid, divide by num_points), and checking if that has converged in the same way as above, but I'm not convinced this will always be accurate.
What is the best way of doing this?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
如果我理解正确的话,您会进行某种时间步进,通过计算
old_pottial
来创建new_pottial
中的值。然后使旧的等于新的并继续。您可以用单个语句替换现有的收敛测试
,这可能会更快。如果测试的速度是一个主要问题,您可以仅测试每隔一次(或每第三次或第四次......)迭代
一些评论:
1)如果您使用具有 2 个平面的势阵列,而不是 old_ 和 new_pottial,您可以通过在每次迭代结束时交换索引来将 new_ 转移到 old_ 。正如您的代码所示,正在进行大量数据移动。
2)虽然从语义上讲,使用 while 循环是正确的,但我总是使用具有最大迭代次数的 do 循环,以防万一永远无法满足收敛标准。
3) 在您的声明中
REAL(KIND=DP) :: TOL = 0.001_DP
DP 对 TOL 数值的说明是多余的,REAL(KIND=DP) :: TOL = 0.001
就足够了。我还将它作为一个参数,如果编译器知道它是不可变的,它可能能够优化它的使用。4) 您实际上并不需要在最外层循环内执行 CONVERGED = .TRUE.,只需在第一次迭代之前设置它 - 这将为您节省一两纳秒。
最后,如果您的收敛标准是潜在数组中的每个元素都收敛到 3dp,那么这就是您应该测试的。为您建议的平均值构建反例相对容易。但是,我担心您的系统永远不会在每个元素上收敛,并且您应该使用一些矩阵范数计算来确定收敛。 SO 不是该主题课程的地方。
If I understand correctly you've got some kind of time-stepping going on, where you create the values in
new_potential
by calculations onold_potential
. Then make old equal to new and carry on.You could replace your existing convergence tests with the single statement
which might be faster. If the speed of the test is a major concern you could test only every other (or every third or fourth ...) iteration
A few comments:
1) If you used a potential array with 2 planes, instead of an old_ and new_potential, you could transfer new_ into old_ by swapping indices at the end of each iteration. As your code stands there's a lot of data movement going on.
2) While semantically you are right to have a while loop, I'd always use a do loop with a maximum number of iterations, just in case the convergence criterion is never met.
3) In your declaration
REAL(KIND=DP) :: TOL = 0.001_DP
the specification of DP on the numerical value of TOL is redundant,REAL(KIND=DP) :: TOL = 0.001
is adequate. I'd also make this a parameter, the compiler may be able to optimise its use if it knows that it is immutable.4) You don't really need to execute
CONVERGED = .TRUE.
inside the outermost loop, set it before the first iteration -- this will save you a nanosecond or two.Finally, if your convergence criterion is that every element in the potential array has converged to 3dp then that is what you should test for. It would be relatively easy to construct counterexamples for your suggested averages. However, my concern would be that your system will never converge on every element and that you should be using some matrix norm computation to determine convergence. SO is not the place for a lesson in that topic.
收敛标准的计算是什么?除非它们比提高潜力的计算更糟糕,否则最好使用 IF 语句尽快终止循环,而不是猜测大量迭代以确保获得良好的解决方案。
关于高性能马克的建议#1,如果复制操作占运行时间的很大一部分,您也可以使用指针。
确定这些东西的唯一方法是测量运行时间...Fortran 提供了内部函数来测量 CPU 和时钟时间。否则,您可能会修改代码的某些部分以使其更快,也许会使其不太容易理解,并可能引入错误,可能在运行时没有太大改进......如果该部分只占总运行时的一小部分,再多的聪明也不会产生多大的影响。
正如 High Performance Mark 所说,尽管当前的语义很优雅,但您可能希望防止无限循环。一种方法:
What are the calculations for the convergence criteria? Unless they are worse then the calculations to advance the potential it is probably better to have the IF statement to terminate the loop as soon as possible rather than guess a very large number of iterations to be sure to obtain a good solution.
Re High Performance Mark's suggestion #1, if the copying operation is a significant portion of the run time, you could also use pointers.
The only way to be sure about this stuff is to measure the run time ... Fortran provides intrinsic functions to measure both CPU and clock time. Otherwise you may modify your some portion of you code to make it faster, perhaps making it less easier to understand and possibly introducing a bug, possibly without much improvement in runtime ... if that portion was taking a small amount of the total runtime, no amount of cleverness will can make much difference.
As High Performance Mark says, though the current semantics are elegant, you probably want to guard against an infinite loop. One approach:
也许更好
Maybe better