Error when Rmpi slaves call a user-defined function
I have written an Rmpi program in which I want the master to share the workload equally with the slaves. So the function work_by_master issues an mpi.bcast.cmd to invoke work_by_slaves, and both functions in turn call work_to_be_done_per_process before doing a send-receive to exchange the results.
I kept getting this error:
Error in mpi.probe(source, tag, comm, status) : ignoring SIGPIPE signal
Calls: work_by_master -> mpi.recv.Robj -> mpi.probe -> .Call
I struggled to understand what the error meant, and after investing a lot of time I indirectly concluded that it may come from the slaves being unable to call a user-defined function in a nested way: when I inlined work_to_be_done_per_process into work_by_slaves and let only the master call work_to_be_done_per_process, the error went away.
I also tried duplicating the function work_to_be_done_per_process into work_to_be_done_per_process_by_slaves and work_to_be_done_per_process_by_master and letting the slaves and the master call them respectively. Even that did not fix the problem, so my conclusion above seems to be the only explanation.
Is that true? Has anyone else run into this problem of a slave being unable to call a user-defined function from inside another function? Is there a way to do this correctly?
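For what it's worth, one common cause of this symptom is that the slaves' R sessions simply do not contain the definition of the nested function. A minimal sketch of one way the described setup could work in Rmpi (function bodies, slave count, and tags here are placeholders, not the asker's actual code, and it needs an MPI runtime to execute): the key step is broadcasting every user-defined function to the slaves with mpi.bcast.Robj2slave before invoking anything remotely.

```r
library(Rmpi)

# Placeholder for the real per-process computation.
work_to_be_done_per_process <- function(rank) {
  sum(seq_len(rank + 1))
}

# Each slave computes its share and sends it back to the master (rank 0).
work_by_slaves <- function() {
  result <- work_to_be_done_per_process(mpi.comm.rank())
  mpi.send.Robj(result, dest = 0, tag = 1)
}

work_by_master <- function(nslaves) {
  # Without these two broadcasts, the slaves' environments have no
  # definition of either function, and the nested call fails.
  mpi.bcast.Robj2slave(work_to_be_done_per_process)
  mpi.bcast.Robj2slave(work_by_slaves)
  mpi.bcast.cmd(work_by_slaves())
  # The master does its own share, then collects the slaves' results.
  master_result <- work_to_be_done_per_process(0)
  slave_results <- lapply(seq_len(nslaves), function(i) {
    mpi.recv.Robj(mpi.any.source(), tag = 1)
  })
  c(list(master_result), slave_results)
}

mpi.spawn.Rslaves(nslaves = 2)
print(work_by_master(2))
mpi.close.Rslaves()
mpi.exit()
```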
In my experience with parallel processing in R, each core used in the cluster gets a separate R environment. All of those environments are initialized as if you had started a normal R session, so any user-defined functions that are not loaded by default at startup are not available there. Loading them into the worker nodes should fix this problem. In a recent blog post I showed how to do this for SNOW clusters; maybe it is of some use to you.
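As a small illustration with base R's parallel package (which provides the SNOW-style API; the function f here is just a stand-in): clusterExport copies a named object from the master into each worker's global environment, so the workers can call it remotely, including from inside another function.

```r
library(parallel)

# Hypothetical user-defined function, not loaded in a fresh R session.
f <- function(x) x^2

cl <- makeCluster(2)
clusterExport(cl, "f")  # ship f's definition into each worker
# The anonymous function runs on the workers and calls f in a nested way;
# this only works because f was exported above.
res <- parSapply(cl, 1:4, function(x) f(x))
stopCluster(cl)
print(res)  # 1 4 9 16
```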