Timing out an R command via something like try()
I'm running a large number of iterations in parallel. Certain iterations take much longer (say 100x) than others. I want to time these out, but I'd rather not have to dig into the C code behind the function (call it fun.c) doing the heavy lifting. I am hoping there is something similar to try() but with a time.out option. Then I could do something like:

for (i in 1:1000) {
  try(fun.c(args), time.out = 60) -> to.return[i]
}

So if fun.c took longer than 60 seconds for a certain iteration, the revamped try() function would just kill it and return a warning or something along those lines.

Anybody have any advice? Thanks in advance.
4 Answers
See this thread: http://r.789695.n4.nabble.com/Time-out-for-a-R-Function-td3075686.html and ?evalWithTimeout in the R.utils package. Here's an example:
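For instance, a minimal sketch (slow.fn here is a made-up stand-in for a long-running call):

library(R.utils)

## A stand-in for a computation that takes too long
slow.fn <- function(secs) {
  Sys.sleep(secs)
  "finished"
}

## Finishes well within the limit and returns "finished"
evalWithTimeout(slow.fn(1), timeout = 5)

## Interrupted after ~2 seconds; onTimeout controls whether the
## timeout raises an error (the default), a warning, or stays silent
evalWithTimeout(slow.fn(10), timeout = 2, onTimeout = "warning")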
This sounds like it should be something that should be managed by whatever is doling out tasks to the workers, rather than something that should be contained in a worker thread. The multicore package supports timeouts for some functions; snow doesn't, as far as I can tell.

EDIT: If you're really desperate to have this in the worker threads, then try this function, inspired by the links in @jthetzel's answer.
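Something along these lines (a sketch built on setTimeLimit() wrapped in try(); the name try_with_time_limit is made up here):

## Evaluate expr under a transient time limit; if it errors or
## exceeds the limit, try() catches the error and NULL is returned
try_with_time_limit <- function(expr, cpu = Inf, elapsed = Inf) {
  y <- try({
    setTimeLimit(cpu = cpu, elapsed = elapsed, transient = TRUE)
    expr
  }, silent = TRUE)
  if (inherits(y, "try-error")) NULL else y
}

try_with_time_limit(sqrt(1:10), elapsed = 1)                  ## returns as normal
try_with_time_limit(for (i in 1:1e7) sqrt(1:10), elapsed = 1) ## returns NULL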
You'll perhaps want to customise the behaviour in the event of a timeout. At the moment it just returns NULL.

I like R.utils::withTimeout(), but I also aspire to avoid package dependencies if I can. Here is a solution in base R. Please note the on.exit() call. It makes sure to remove the time limit even if your expression throws an error.
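A sketch of that idea (with_timeout is a hypothetical name; the mechanism is base R's setTimeLimit()):

## Evaluate expr in the caller's frame under a transient time limit;
## on.exit() clears the limit even if expr throws an error
with_timeout <- function(expr, cpu = Inf, elapsed = Inf) {
  expr <- substitute(expr)
  envir <- parent.frame()
  setTimeLimit(cpu = cpu, elapsed = elapsed, transient = TRUE)
  on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
  eval(expr, envir = envir)
}

## Errors with "reached elapsed time limit" after ~1 second:
with_timeout(for (i in 1:1e7) sqrt(1:10), elapsed = 1)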
You mentioned in a comment that your problem is with C code running long. In my experience, none of the purely R-based timeout solutions based on setTimeLimit / evalWithTimeout can stop the execution of C code unless the code provides an opportunity to interrupt back to R.

You also mentioned in a comment that you are parallelizing over SNOW. If the machines you are parallelizing to run an OS that supports forking (i.e., not Windows), then you can use mcparallel (in the parallel package, derived from multicore) within the context of a command to a node on a SNOW cluster; the inverse is also true, BTW: you can trigger SNOW clusters from the context of a multicore fork. This answer also (of course) holds if you aren't parallelizing via SNOW, provided the machine that needs to time out the C code can fork.

This lends itself to eval_fork, a solution used by opencpu. Look below the body of the eval_fork function for an outline of a hack in Windows and a poorly implemented half version of that hack.
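A sketch in the spirit of opencpu's eval_fork (not its exact code): fork the expression with parallel::mcparallel(), collect the result with a timeout, and kill the child if it is still running. This only works on Unix-alikes, where forking is available:

eval_fork <- function(expr, timeout = 60) {
  fork <- parallel::mcparallel(expr)
  ## Wait at most `timeout` seconds for the child to finish
  result <- parallel::mccollect(fork, wait = FALSE, timeout = timeout)
  ## Kill the child if it is still alive, then reap it
  tools::pskill(fork$pid, tools::SIGKILL)
  parallel::mccollect(fork, wait = FALSE)
  if (is.null(result))
    stop("call did not return within ", timeout, " seconds; terminated")
  result <- result[[1]]
  if (inherits(result, "try-error")) stop(attr(result, "condition"))
  result
}

## e.g. eval_fork(Sys.sleep(120), timeout = 2) errors after ~2 seconds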
Windows hack:

In principle, especially with worker nodes in SNOW, you could accomplish something similar by having the worker nodes:

1. save their workspace (save.image) to a known location, and then
2. make a system call to Rscript with an R script that loads the workspace saved by the node and then saves a result (essentially doing a slow memory fork of the R workspace).

I wrote some code a /long/ time ago for something like mcparallel on Windows on localhost using slow memory copies. I would write it completely differently now, but it might give you a place to start, so I'm providing it anyway. Some gotchas to note:
russmisc was a package I was writing, which is now on github as repsych. glibrary is a function in repsych that installs a package if it isn't already available (potentially important if your SNOW isn't just on localhost). ... and of course I haven't used this code for /years/, and I haven't tested it recently - it is possible the version I'm sharing contains errors that I resolved in later versions.
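As a rough sketch of that slow memory fork (all file names and the worker script here are hypothetical, not the russmisc code):

## Parent side: "fork" by saving the workspace to disk, launch a
## child Rscript, and poll for its result file until the timeout
slow_fork <- function(worker_script, timeout = 60) {
  save.image("parent_workspace.RData")
  system2("Rscript", worker_script, wait = FALSE)
  deadline <- Sys.time() + timeout
  while (Sys.time() < deadline) {
    if (file.exists("child_result.RData")) {
      load("child_result.RData")   ## expected to define `result`
      return(result)
    }
    Sys.sleep(1)
  }
  NULL   ## timed out; the orphaned child still needs cleaning up
}

The worker script (say, worker.R) would then be along the lines of:

load("parent_workspace.RData")
result <- fun.c(args)
save(result, file = "child_result.RData")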