doMC、doSNOW、doSMP 与 doMPI:为什么“foreach”的各种并行后端不可用?功能等价?
我已经在不同的机器上运行了一些测试代码,总是得到相同的结果。我认为各种 do... 包背后的理念是它们可以互换用作 foreach 的 %dopar% 的后端。为什么情况并非如此?
例如,此代码片段有效:
library(plyr)
library(doMC)
registerDoMC()
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
虽然每个代码片段都失败:
library(plyr)
library(doSMP)
workers <- startWorkers(2)
registerDoSMP(workers)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
stopWorkers(workers)
library(plyr)
library(snow)
library(doSNOW)
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
stopCluster(cl)
library(plyr)
library(doMPI)
cl <- startMPIcluster(count = 2)
registerDoMPI(cl)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
closeCluster(cl)
在所有四种情况下,foreach(i = 1:3,.combine = "c") %dopar% {sqrt(i)} 产生完全相同的结果,所以我知道我已经安装了这些软件包并且在我测试过它们的每台机器上正常工作。
doMC 与 doSMP、doSNOW 和 doMPI 有何不同?
I've got a few test pieces of code that I've been running on various machines, always with the same results. I thought the philosophy behind the various do... packages was that they could be used interchangeably as a backend for foreach's %dopar%. Why is this not the case?
For example, this code snippet works:
library(plyr)
library(doMC)
registerDoMC()
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
While each of these code snippets fail:
library(plyr)
library(doSMP)
workers <- startWorkers(2)
registerDoSMP(workers)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
stopWorkers(workers)
library(plyr)
library(snow)
library(doSNOW)
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
stopCluster(cl)
library(plyr)
library(doMPI)
cl <- startMPIcluster(count = 2)
registerDoMPI(cl)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)
closeCluster(cl)
In all four cases, foreach(i = 1:3,.combine = "c") %dopar% {sqrt(i)}
yields the exact same result, so I know I have the packages installed and working properly on each machine I've tested them on.
What is doMC doing differently from doSMP, doSNOW, and doMPI?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
doMC 分叉当前 R 进程,因此它继承所有现有变量。所有其他后端仅传递显式请求的变量。不幸的是我没有意识到这一点,并且只使用
doMC
进行了测试 - 这是我希望在下一个版本的 plyr 中修复的问题。doMC
forks the current R process so it inherits all the existing variables. All the other do backends only pass on explicitly requested variables. Unfortunately I didn't realise that, and only tested withdoMC
- this is something I hope to fix in the next version of plyr.