Monitoring the progress of a parallel computation in Mathematica
I'm building a large ParallelTable, and would like to maintain some sense of how the computation is going. For a non-parallel table the following code does a great job:
counter = 1;
Timing[
Monitor[
Table[
counter++
, {n, 10^6}];
, ProgressIndicator[counter, {0, 10^6}]
]
]
with the result {0.943512, Null}. For the parallel case, however, it's necessary to make the counter shared between the kernels:
counter = 1;
SetSharedVariable[counter];
Timing[
Monitor[
ParallelTable[
counter++
, {n, 10^4}];
, ProgressIndicator[counter, {0, 10^4}]
]
]
with the result {6.33388, Null}. Since the value of counter needs to be passed back and forth between the kernels at every update, the performance hit is beyond severe. Any ideas for how to get some sense of how the computation is going? Perhaps letting each kernel have its own value for counter and summing them at intervals? Perhaps some way of determining what elements of the table have already been farmed out to the kernels?
Comments (4)
You nearly gave the answer yourself, when you said "Perhaps letting each kernel have its own value for counter and summing them at intervals?".
Try something like this:
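The original snippet isn't included in this copy, but a minimal sketch of that idea could look like the following, where each kernel keeps a private localCount that it adds to the shared counter at most about once per second; the variable names, the DistributedContexts -> None option, and the Pause stand-in for real work are assumptions:

counter = 0;
SetSharedVariable[counter];
Timing[
 Monitor[
  ParallelTable[
   (* initialize the per-kernel state on first use *)
   If[! ValueQ[localCount], localCount = 0; last = AbsoluteTime[]];
   localCount++;
   (* push the private tally to the shared counter at most once per second *)
   If[AbsoluteTime[] - last > 1,
    counter += localCount; localCount = 0; last = AbsoluteTime[]];
   Pause[0.001]                       (* stand-in for the real per-element work *)
   , {n, 10^4}
   , DistributedContexts -> None]     (* keep localCount and last kernel-local *)
  , ProgressIndicator[counter, {0, 10^4}]
  ]
 ]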
Note that it takes longer than your first single-CPU case only because it actually does something in the loop.
You can change the test AbsoluteTime[] - last > 1 to something more frequent like AbsoluteTime[] - last > 0.1.
This seems hard to solve. From the manual:

However, a rough progress indicator can still be gotten using the old Print statement:
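The example itself isn't included here; a rough sketch of the idea, relying on Print output from the subkernels being forwarded to the main notebook, could be as follows (the reporting interval of 1000 and the Pause stand-in are arbitrary choices):

ParallelTable[
 If[Mod[n, 1000] == 0, Print["reached n = ", n]];  (* coarse progress report *)
 Pause[0.001],                                     (* stand-in for the real work *)
 {n, 10^4}]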
Another approach is to put a trace on LinkWrite and LinkRead and modify their tracing messages to do some useful accounting.
First, launch some parallel kernels:
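The launch code isn't shown here; presumably something along these lines, where kernelLinks is a name introduced for the open LinkObjects (note that Links[] also returns links that are not subkernels, such as the front end link):

LaunchKernels[];
kernelLinks = Links[];  (* all open LinkObjects, including the new subkernel links *)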
This will have set up the link objects for the parallel kernels.
Then define an init function for link read and write counters:
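One possible shape for it, using hypothetical counter symbols readCount and writeCount keyed by LinkObject:

initCounters[links_List] := (readCount[#] = 0; writeCount[#] = 0) & /@ links;
initCounters[kernelLinks];  (* reset the per-link counters *)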
Next, you want to increment these counters when their links are being read from or written to:
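The code for this step isn't shown either. One way to get the described effect is to intercept the LinkWrite::trace and LinkRead::trace messages generated by tracing, here by overloading Message; this is a reconstruction that assumes trace messages are routed through Message with the traced call wrapped in HoldForm:

Unprotect[Message];
(* x arrives as HoldForm[LinkWrite[link, ...]] or HoldForm[LinkRead[link]], so x[[1, 1]] is the LinkObject *)
Message[LinkWrite::trace, x_, ___] :=
  If[ValueQ[writeCount[x[[1, 1]]]], writeCount[x[[1, 1]]]++];
Message[LinkRead::trace, x_, ___] :=
  If[ValueQ[readCount[x[[1, 1]]]], readCount[x[[1, 1]]]++];
Protect[Message];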
Here, x[[1,1]] is the LinkObject in question. Now, turn on tracing on LinkWrite and LinkRead:
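That step is presumably just:

On[LinkWrite]; On[LinkRead]  (* every call now issues a ::trace message *)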
To format the progress display, first shorten the LinkObject display a bit, since they are rather verbose:
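One way to do that (a guess at the intent, since the original definition isn't shown) is to attach a terser Format to LinkObject:

Unprotect[LinkObject];
Format[LinkObject[name_String, ___]] := "link: " <> name;  (* shorter display for links *)
Protect[LinkObject];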
And this is a way to display the reads and writes dynamically for the subkernel links:
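A sketch of such a display, built on the hypothetical kernelLinks, readCount, and writeCount from above, with the counts halved for the reason given below:

Dynamic[
 Grid[
  Table[{link, readCount[link]/2, writeCount[link]/2}, {link, kernelLinks}],
  Frame -> All]]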
(I'm dividing the counts by two, because every link read and write is traced twice).
And finally, test it out with a 10,000-element table:
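The test isn't reproduced here; presumably something along these lines, with Pause standing in for the actual per-element work and the "FinestGrained" method mentioned below:

initCounters[kernelLinks];   (* reset the counters before the run *)
AbsoluteTiming[
 ParallelTable[Pause[0.001], {n, 10^4}, Method -> "FinestGrained"];
 ]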
If everything worked, you should see a final progress display with about 5,000 reads and writes for each kernel:
There is a medium performance penalty for this: 10.73 s without the monitor and 13.69 s with it. And of course, using the "FinestGrained" option is not the best method for this particular parallel computation.
You can get some ideas from the Spin`System`LoopControl` package developed by Yuri Kandrashkin:

Announcement of the Spin` package: