Self-restarting MathKernel - is it possible in Mathematica?
This question follows from the recent question "Correct way to cap Mathematica memory use?"
Is it possible to restart MathKernel programmatically, keeping the current FrontEnd process connected to the new MathKernel process and evaluating some code in the new MathKernel session? I mean a "transparent" restart that lets the user keep working with the FrontEnd while a fresh MathKernel process has some code from the previous kernel evaluated (or still evaluating) in it.
The motivation is to automate restarting MathKernel when it takes too much memory, without breaking the computation. In other words, the computation should continue automatically in the new MathKernel process without interaction with the user, while the user remains able to interact with Mathematica as before. The details of what code should be evaluated in the new kernel are of course specific to each computational task. I am looking for a general solution for continuing the computation automatically.
From a comment by Arnoud Buzing yesterday, on Stack Exchange Mathematica chat, quoting entirely:
In a notebook, if you have multiple cells you can put Quit in a cell by itself and set this option:
Then if you have a cell above it and below it and select all three and evaluate, the kernel will Quit but the frontend evaluation queue will continue (and restart the kernel for the last cell).
-- Arnoud Buzing
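The option itself is missing from the quote as reproduced here. The front-end option usually cited for this behaviour is "ClearEvaluationQueueOnKernelQuit"; treating that name as an assumption rather than part of the quote, the three-cell arrangement might be sketched as follows (the cell contents are hypothetical placeholders):

```mathematica
(* Assumed option name: keep queued cells when the kernel quits *)
SetOptions[$FrontEnd, "ClearEvaluationQueueOnKernelQuit" -> False]

(* Cell 1: work done in the old kernel (placeholder) *)
Print["before restart: ", MemoryInUse[]]

(* Cell 2: Quit in a cell by itself *)
Quit

(* Cell 3: evaluated in the freshly started kernel (placeholder) *)
Print["after restart: ", MemoryInUse[]]
```

Each commented group stands for a separate notebook cell; the three cells are selected together and evaluated in one go.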
The following approach runs one kernel to open a front-end with its own kernel, which is then closed and reopened, renewing the second kernel.
This file is the MathKernel input, C:\Temp\test4.m
The demo notebook, C:\Temp\run.nb contains two cells:
The initial kernel opens a front-end and runs the first cell, then it quits the front-end, reopens it and runs the second cell.
The whole thing can be run either by pasting (in one go) the MathKernel input into a kernel session, or it can be run from a batch file, e.g. C:\Temp\RunTest2.bat
It's a little elaborate to set up, and in its current form it depends on knowing how long to wait before closing and restarting the second kernel.
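The files themselves are not reproduced here. As a rough sketch only, a driver along the lines described could be written with the J/Link front-end helpers UseFrontEnd and CloseFrontEnd; the helper name runCell, the launch command and the pause lengths are assumptions, and the pauses are exactly the guessed waiting times mentioned above:

```mathematica
Needs["JLink`"];
$FrontEndLaunchCommand = "Mathematica.exe";  (* assumed to be on the PATH *)

(* Hypothetical helper: open the demo notebook in a front end
   and evaluate its k-th cell *)
runCell[k_Integer] := UseFrontEnd[
  Module[{nb = NotebookOpen["C:\\Temp\\run.nb"]},
    SelectionMove[nb, Before, Notebook];
    SelectionMove[nb, Next, Cell, k];
    SelectionEvaluate[nb]]];

runCell[1];
Pause[8];          (* guessed time for the first cell to finish *)
CloseFrontEnd[];   (* closing the front end also ends its kernel *)
Pause[1];
runCell[2];        (* reopened front end, fresh second kernel *)
Pause[8];
CloseFrontEnd[];
```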
Perhaps the parallel computation machinery could be used for this? Here is a crude set-up that illustrates the idea:
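A minimal sketch of such a set-up, not the original listing: getTheJobDone, resultSoFar and the Ouch! message follow the description in this answer, while doSomeWork, initsub and the optional argument n are hypothetical names. It assumes that a failed subkernel evaluation shows up as $Failed in the main kernel.

```mathematica
(* Hypothetical unit of work: one triple of numbers per call *)
doSomeWork[input_List] := {$KernelID, Length[input], RandomReal[]};

getTheJobDone[n_Integer : 1000] :=
  Module[{subkernel, initsub, resultSoFar = {}, result},
    (* (re)launch one local subkernel and send it the work function *)
    initsub[] := (
      subkernel = First[LaunchKernels[1]];
      DistributeDefinitions[doSomeWork]);
    initsub[];
    While[Length[resultSoFar] < n,
      (* make the current partial result visible to the subkernel *)
      DistributeDefinitions[resultSoFar];
      result = Quiet[ParallelEvaluate[doSomeWork[resultSoFar], subkernel]];
      If[result === $Failed,
        Print["Ouch!"]; initsub[],        (* subkernel died: relaunch *)
        AppendTo[resultSoFar, result]]];  (* otherwise keep the triple *)
    CloseKernels[subkernel];
    resultSoFar]
```

Evaluating getTheJobDone[] would then run the loop until 1,000 triples have been collected.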
This is an over-elaborate setup to generate a list of 1,000 triples of numbers. getTheJobDone runs a loop that continues until the result list contains the desired number of elements. Each iteration of the loop is evaluated in a subkernel. If the subkernel evaluation fails, the subkernel is relaunched. Otherwise, its return value is added to the result list.
To try this out, evaluate:
To demonstrate the recovery mechanism, open the Parallel Kernel Status window and kill the subkernel from time to time. getTheJobDone will feel the pain and print Ouch! whenever the subkernel dies. However, the overall job continues and the final result is returned.
The error handling here is very crude and would likely need to be bolstered in a real application. Also, I have not investigated whether really serious error conditions in the subkernels (like running out of memory) would have an adverse effect on the main kernel. If so, then perhaps subkernels could kill themselves if MemoryInUse[] exceeded a predetermined threshold.
Update - Isolating the Main Kernel From Subkernel Crashes
While playing around with this framework, I discovered that any use of shared variables between the main kernel and subkernel rendered Mathematica unstable should the subkernel crash. This includes the use of DistributeDefinitions[resultSoFar] as shown above, and also explicit shared variables using SetSharedVariable.
To work around this problem, I transmitted resultSoFar through a file. This eliminated the synchronization between the two kernels, with the net result that the main kernel remained blissfully unaware of a subkernel crash. It also had the nice side effect of retaining the intermediate results in the event of a main kernel crash. Of course, it also makes the subkernel calls quite a bit slower. But that might not be a problem if each call to the subkernel performs a significant amount of work.
Here are the revised definitions:
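A minimal sketch of what file-based definitions could look like, not the original listing: the file location $resultFile, doSomeWork, initsub and the optional argument n are hypothetical, and a failed subkernel evaluation is again assumed to come back as $Failed. The partial result is written with Put in the main kernel and read with Get in the subkernel, so only the file path crosses the kernel boundary.

```mathematica
(* Hypothetical location for the intermediate results *)
$resultFile = FileNameJoin[{$TemporaryDirectory, "resultSoFar.m"}];

(* Hypothetical unit of work: one triple of numbers per call *)
doSomeWork[input_List] := {$KernelID, Length[input], RandomReal[]};

getTheJobDone[n_Integer : 1000] :=
  Module[{subkernel, initsub, resultSoFar = {}, result},
    initsub[] := (
      subkernel = First[LaunchKernels[1]];
      DistributeDefinitions[doSomeWork]);
    initsub[];
    While[Length[resultSoFar] < n,
      (* hand over the partial result through the file *)
      Put[resultSoFar, $resultFile];
      (* With injects the path string into the held expression *)
      result = With[{file = $resultFile},
        Quiet[ParallelEvaluate[doSomeWork[Get[file]], subkernel]]];
      If[result === $Failed,
        Print["Ouch!"]; initsub[],
        AppendTo[resultSoFar, result]]];
    CloseKernels[subkernel];
    resultSoFar]
```

Because the whole list is rewritten on every iteration, this version trades speed for isolation, which matches the slowdown noted above.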
I had a similar requirement when running a CUDAFunction in a long loop, where CUDALink ran out of memory (similar to this: https://mathematica.stackexchange.com/questions/31412/cudalink-ran-out-of-available-memory). The memory leak has not improved even in the latest Mathematica 10.4. I worked out a workaround that you may find useful. The idea is to use a bash script to call a Mathematica program (run in batch mode) multiple times, passing parameters from the bash script. Here are detailed instructions and a demo (this is for Windows):
Here is a demo of the test.m file
This Mathematica code reads the parameter from the command line and uses it in the calculation.
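The test.m listing is not reproduced here. As a sketch of the idea only, a script could pick its parameter out of $CommandLine as below; the flag name -start, the helper startValue and the Pause stand-in for real work are all assumptions.

```mathematica
(* Hypothetical helper: return the number that follows "-start" in a
   list of command-line strings, or Missing[...] if it is absent *)
startValue[args_List] :=
  Module[{pos = Position[args, "-start"], s},
    If[pos === {} || pos[[1, 1]] >= Length[args],
      Missing["NotFound"],
      s = args[[pos[[1, 1]] + 1]];
      If[StringMatchQ[s, NumberString],
        ToExpression[s],
        Missing["NotANumber"]]]];

(* In the script: read the parameter and use it *)
start = startValue[$CommandLine];
If[NumberQ[start],
  Pause[start];  (* stand-in for the real calculation *)
  Print["Done in ", start, " second(s)"]]
```

Each run of the script then starts in a fresh kernel, so memory leaked by one run is released before the next one begins.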
Here is the bash script (script.sh) to run test.m many times with different parameters.
In the cygwin terminal type "chmod a+x script.sh" to enable the script then you can run it by typing "./script.sh".
You can programmatically terminate the kernel using Exit[]. The front end (notebook) will automatically start a new kernel when you next try to evaluate an expression.
Preserving "some code from the previous kernel" is going to be more difficult. You have to decide what you want to preserve. If you think you want to preserve everything, then there's no point in restarting the kernel. If you know what definitions you want to save, you can use DumpSave to write them to a file before terminating the kernel, and then use << to load that file into the new kernel.
On the other hand, if you know what definitions are taking up too much memory, you can use Unset, Clear, ClearAll, or Remove to remove those definitions. You can also set $HistoryLength to something smaller than Infinity (the default) if that's where your memory is going.
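A minimal sketch of the save-and-reload step, under stated assumptions: the file location stateFile, the helpers saveState and restoreState, and the symbol resultSoFar are hypothetical stand-ins for whatever state is worth keeping.

```mathematica
(* Hypothetical location for the saved definitions *)
stateFile = FileNameJoin[{$TemporaryDirectory, "kernel-state.mx"}];

(* Write the definitions worth keeping; here a single symbol *)
saveState[] := DumpSave[stateFile, resultSoFar];

(* Load them back; Get[file] is the full form of << file *)
restoreState[] := Get[stateFile];

resultSoFar = Range[10];  (* stand-in for real intermediate results *)
saveState[];

(* Exit[] would go here. The next evaluation starts a fresh kernel,
   where stateFile must be defined again before calling restoreState[] *)
```

Files written by DumpSave are tied to the type of system that wrote them, which is not a concern when the same machine restarts its own kernel.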
Sounds like a job for CleanSlate.
From: http://library.wolfram.com/infocenter/TechNotes/4718/
"CleanSlate, tries to do everything possible to return the kernel to the state it was in when the CleanSlate.m package was initially loaded."