I do not understand what foreach exports to the workers when using the doMC backend.
Is there a way to control what is shipped to the workers?
The following example seems to suggest that everything in the global environment is shipped, which causes me trouble when I work with tables of 100+ GB.
library("foreach")
library("doMC")
registerDoMC(3L)
x <- 1
y <- 2
foo <- function(x) x+1
cat(sprintf("hostname: %s, pid: %s, Objects are: %s\n", Sys.getenv("HOSTNAME"), Sys.getpid(), paste(ls(), collapse = ",")))
res <- foreach(task = 1:3, .noexport = ls()) %dopar% {
  cat(sprintf("hostname: %s, pid: %s, Objects are: %s\n", Sys.getenv("HOSTNAME"), Sys.getpid(), paste(ls(), collapse = ",")))
  foo(task)
}
print(res)
Result (how is foo known to the workers?):
hostname: nyzls604m, pid: 372957, Objects are: foo,res,x,y
hostname: nyzls604m, pid: 385492, Objects are: task
hostname: nyzls604m, pid: 385493, Objects are: task
hostname: nyzls604m, pid: 385494, Objects are: task
[[1]]
[1] 2
[[2]]
[1] 3
[[3]]
[1] 4
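A sketch that might help narrow this down. It assumes that doMC runs its tasks in forked child processes through the parallel package (as its package description states) and that the machine is Unix-like, so it calls parallel::mclapply directly instead of going through foreach. The names probe, sees_foo and sees_x are only illustrative. If the assumption holds, ls() inside a task lists only the task's local variables, while exists() still finds foo and x in the parent's environment, so ls() would not be a reliable indication of what a worker can reach.

```r
library(parallel)

x <- 1
foo <- function(x) x + 1

# Assumption: forking is available (Unix-like OS); mc.cores > 1 fails on Windows.
# Nothing is exported explicitly; each child is a fork of this process.
probe <- mclapply(1:2, function(task) {
  list(
    pid      = Sys.getpid(),   # pid of the process that ran this task
    local    = ls(),           # variables local to this function call only
    sees_foo = exists("foo"),  # looked up through the enclosing environments
    sees_x   = exists("x"),
    result   = foo(task)
  )
}, mc.cores = 2L)

str(probe)
```

If sees_foo and sees_x come back TRUE while local contains only task, the workers are reading the parent's objects through the fork rather than receiving an exported copy.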