I'm having trouble getting nested foreach loops to run in parallel. Here's the situation:
This program performs hypothesis tests using different numbers of observations and different levels of statistical significance. I have four nested foreach loops. The item data.structures is a list of matrices on which the tests are performed. There are two different lists I'm using for data.structures. One list contains 243 matrices (the small list), and the other contains 19,683 (the large list).
observations = c(50,100,250,500,1000)
significance.levels = c(.001,.01,.05,.1,.15)
require(foreach)
require(doParallel)
cl = makeCluster(detectCores())
registerDoParallel(cl)
results = foreach(data=data.structures,.inorder=FALSE,.combine='rbind') %:%
  foreach(iter=1:iterations,.inorder=FALSE,.combine='rbind') %:%
    foreach(number.observations=observations,.inorder=FALSE,.combine='rbind') %:%
      foreach(alpha=significance.levels,.inorder=FALSE,.combine='rbind') %dopar% {
        #SOME FUNCTIONS HERE
      }
When I use the small list of matrices for data.structures, I can see all of the cores being fully utilized (100 percent CPU usage) in Windows' Resource Monitor, with six threads for each of the eight processes, and the job completes as expected in a much shorter amount of time. When I switch to the large list of matrices, however, the processes are started and appear in the Processes section of the Resource Monitor, but each of the eight processes shows only three threads with no CPU activity. The total CPU usage is approximately 12 percent.
I'm new to parallelization with R. Even if I simplify the problem and the functions, I'm still only able to get the program to run in parallel with the small list. From my own reading, I'm wondering if this is an issue with workload distribution. I've included the .inorder = FALSE option to try to work around this, to no avail. I'm fairly certain that this program is a good candidate for parallelization because it performs the same task hundreds of thousands of times and the loops don't depend on previous values.
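One way to probe whether workload distribution is the cause might be to flatten the four nested loops into a single foreach over a precomputed task table, so every task exists before any worker starts. The following is only a minimal sketch under stated assumptions: the tiny data.structures list, iterations = 2, the shortened parameter vectors, the two-worker cluster, and the sum(m) "test" are all hypothetical placeholders, not the real program.

```r
library(foreach)
library(doParallel)

# Hypothetical stand-ins for the real inputs (not from the original program).
data.structures <- replicate(6, matrix(rnorm(4), 2, 2), simplify = FALSE)
iterations <- 2
observations <- c(50, 100)
significance.levels <- c(.01, .05)

# One row per task: every combination of matrix index, iteration,
# number of observations, and significance level.
tasks <- expand.grid(
  data.index = seq_along(data.structures),
  iter = seq_len(iterations),
  n.obs = observations,
  alpha = significance.levels
)

cl <- makeCluster(2)  # assumed worker count; the real run uses detectCores()
registerDoParallel(cl)

# A single flat loop over task rows instead of four nested loops.
results <- foreach(i = seq_len(nrow(tasks)), .inorder = FALSE,
                   .combine = 'rbind') %dopar% {
  task <- tasks[i, ]
  m <- data.structures[[task$data.index]]
  # Placeholder for the real hypothesis test.
  data.frame(task, statistic = sum(m))
}

stopCluster(cl)
```

Note that in this form the whole data.structures list is exported to every worker, which may itself be costly with the large list, so this is a diagnostic experiment rather than a confirmed fix.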
Any help is tremendously appreciated!
Similar issues happened in my code too.
A very simple set of nested foreach loops executes, but not in parallel.