I am building an R package that I want to be cross-platform. I am developing under Linux and use the function mclapply from the parallel package. This package is not supported on Windows (which uses doParallel). I really like the parallel package for its simplicity and speed, though, and I do not know whether this is a reason to maintain two different versions of the package on CRAN, one per OS (which seems like extra work), not to mention whether that is even allowed.
Thoughts?
Also, for now I am regarding parallel's
mclapply(ldata, function(x), mc.cores=cores)
to be equivalent to doParallel's
cl <- makeCluster(cores)
parLapply(cl, ldata, function(x))
Is that correct?
First, both mclapply and parLapply are in the parallel package, although mclapply doesn't actually run in parallel on Windows. parLapply runs in parallel on all supported platforms, but isn't always as efficient as mclapply. The doParallel package is used with the foreach package, and acts as an adapter to the parallel package.

To write a package that works on both Windows and non-Windows, you have a variety of reasonable options:
- parLapply, since it works everywhere
- parLapply on Windows and mclapply elsewhere
- doParallel with foreach

The doParallel package is convenient because it makes use of mclapply on non-Windows platforms. For example, calling registerDoParallel with a number of cores and then running a foreach loop with %dopar% uses mclapply on Linux and Mac OS X, but will automatically create a PSOCK cluster object behind the scenes on Windows. The use of preschedule=TRUE (added in doParallel 1.0.3) will cause doParallel to preschedule the tasks using clusterApply internally, much like parLapply.

Note that if you explicitly create and register a cluster object, then mclapply will not be used, regardless of the platform. It will work fine, but may not be as efficient. To use mclapply, you must call registerDoParallel with a numeric argument, or no argument at all.

You can look at the source code for the boot package for an example of how to use either mclapply or parLapply depending on your platform.
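
As a minimal sketch of the second option (parLapply on Windows and mclapply elsewhere), the platform check can be wrapped in a small helper. The helper name par_lapply, its cores argument, and the sample ldata list are hypothetical and only for illustration; this is not the code used in the boot package.

```r
library(parallel)

# Hypothetical helper: choose the parallel backend based on the platform.
par_lapply <- function(X, FUN, ..., cores = 2L) {
  if (.Platform$OS.type == "windows") {
    # Forking is unavailable on Windows, so use a PSOCK cluster with parLapply.
    cl <- makeCluster(cores)
    on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down
    parLapply(cl, X, FUN, ...)
  } else {
    # On Linux and Mac OS X, mclapply forks the current R process.
    mclapply(X, FUN, ..., mc.cores = cores)
  }
}

# Illustrative data: a list of three integer vectors.
ldata <- list(1:3, 4:6, 7:9)
res <- par_lapply(ldata, sum, cores = 2L)
```

Inside a package you would normally import these functions from parallel in the NAMESPACE rather than calling library(). Note also that with parLapply the workers start as fresh R sessions, so any objects or packages that the function needs must be exported or loaded on the cluster first, which mclapply does not require.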