do.call() within %dopar%: function arguments not found

I am implementing a Monte Carlo simulation to test several methods (many of them).

The methods are implemented in a methods.R script. To illustrate, let's say that I have implemented only 2 methods. methods.R looks like:

method1 <- function(data){
  # Some computations, e.g.:
  N <- nrow(data)
  # ... `results` is built here ...
  return(results)
}

method2 <- function(data){
  # Some computations, e.g.:
  N <- nrow(data)
  # ... `results` is built here ...
  return(results)
}

The methods take quite a long time to run, so I am trying to parallelize the process with the foreach and doParallel packages. The main script looks like:

library(foreach)
library(doParallel)

methods<-c('method1', 'method2')

cl<-makeCluster(4)
registerDoParallel(cl)

clusterEvalQ(cl, source('methods.R'))

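foreach(sim = seq(100)) %dopar% {
  # *Data generation* (elided here); this step assigns `data`

  results <- vector(mode = 'list', length = 0)

  for (method in methods) {
    # Look up the method by name and call it on the simulated data
    RES <- do.call(method, list(data = data))

    results[[method]] <- RES
  }

  results  # last value of the block, so foreach collects it for each sim
}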

The above code, however, fails with the error: task 1 failed - "error in evaluating the argument 'x' in selecting a method for function 'nrow': object 'data' not found"

I have checked the documentation for the do.call function, which seems to evaluate the call in the global environment by default. However, 'data' may not be defined in the global environment but in some cluster-specific environment.
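For reference, here is a small sequential sketch of how do.call resolves things, with no cluster involved. It relies on the documented signature do.call(what, args, quote = FALSE, envir = parent.frame()): the envir argument defaults to the calling frame and is used to look up the function when it is given as a character string, while the args list is built in the caller before the call is made. The names f and run are made up for illustration, and this does not by itself explain the error above.

```r
# Hypothetical toy method, standing in for method1/method2
f <- function(data) nrow(data)

run <- function() {
  # `data` exists only inside run(), not in the global environment
  data <- matrix(1:6, nrow = 3)

  # list(data = data) is evaluated here, in run()'s frame;
  # "f" is looked up starting from envir = parent.frame(), i.e. run()'s frame,
  # and then found in the enclosing (global) environment
  do.call("f", list(data = data))
}

n_rows <- run()
```

So a local `data` passed through the args list is visible to the called function even though it never lives in the global environment.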

I have tried to search for information about how to access these environments, but I have found no answer. Perhaps I am on the wrong track. Does anyone have a solution to this problem?

Answer by Jinjin:

The foreach function is a little weird when accessing variables. Sometimes you need to export a variable (e.g., data) to the workers explicitly through the .export argument. Try the following:

foreach(..., .export=c('data'))%dopar% {...}
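To make the .export idea concrete, here is a minimal sketch under some assumptions that are not in the question: the two methods are toy stand-ins defined in the calling session (instead of being sourced on the workers), the data generation is a random matrix, and only 2 workers and 4 simulations are used. In this variant the names passed to .export are the method functions, because they appear in the loop body only as strings and foreach typically cannot detect them on its own. Whether this resolves the exact error in the question is not confirmed.

```r
library(foreach)
library(doParallel)

# Toy stand-ins for the real methods (assumption, for illustration only)
method1 <- function(data) list(N = nrow(data))
method2 <- function(data) list(N = nrow(data), P = ncol(data))
methods <- c("method1", "method2")

cl <- makeCluster(2)
registerDoParallel(cl)

# .export takes a character vector of variable names to send to the workers
all_results <- foreach(sim = seq(4), .export = methods) %dopar% {
  data <- matrix(rnorm(20), ncol = 2)  # placeholder data generation

  results <- list()
  for (method in methods) {
    results[[method]] <- do.call(method, list(data = data))
  }
  results  # returned to the master, one list per simulation
}

stopCluster(cl)
```

all_results is then a list with one element per simulation, each holding the output of every method by name. If the variable you need on the workers is a data object defined outside the loop, the same .export argument takes its name instead.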