Is there a way to limit the number of R processes running


I use the doMC package, which builds on the multicore package. Several times while debugging in the console, things went sideways and the machine fork-bombed.

Does R expose the setrlimit() syscall? In Python I would use resource.RLIMIT_NPROC for this.

Ideally I'd like to restrict the number of running R processes to a fixed maximum.

EDIT: The OS is Linux (CentOS 6).


2 Answers

Answer by Dirk is no longer here (BEST ANSWER):

You have several choices. Here is the relevant section from Writing R Extensions, Section 1.2.1.1:

    Packages are not stand-alone programs, and an R process could contain
    more than one OpenMP-enabled package as well as other components (for
    example, an optimized BLAS) making use of OpenMP. So careful
    consideration needs to be given to resource usage. OpenMP works with
    parallel regions, and for most implementations the default is to use
    as many threads as 'CPUs' for such regions. Parallel regions can be
    nested, although it is common to use only a single thread below the
    first level. The correctness of the detected number of 'CPUs' and the
    assumption that the R process is entitled to use them all are both
    dubious assumptions. The best way to limit resources is to limit the
    overall number of threads available to OpenMP in the R process: this
    can be done via environment variable 'OMP_THREAD_LIMIT', where
    implemented. Alternatively, the number of threads per region can be
    limited by the environment variable 'OMP_NUM_THREADS' or API call
    'omp_set_num_threads', or, better, for the regions in your code as
    part of their specification. E.g. R uses

        #pragma omp parallel for num_threads(nthreads) ...

    That way you only control your own code and not that of other OpenMP
    users.
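In practice that means setting those variables before any threaded library initialises; doing it in the shell (or ~/.Renviron) that launches R is the most reliable, but a minimal sketch from within R looks like this (the values 4 and 2 are just examples):

    # Cap OpenMP threading for this R process; must run before any
    # OpenMP-using code (optimized BLAS, compiled packages) starts its threads
    Sys.setenv(OMP_THREAD_LIMIT = "4",   # upper bound for the whole process
               OMP_NUM_THREADS  = "2")   # default threads per parallel region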

One of my favourite tools is a package for controlling this: RhpcBLASctl. Here is its Description:

Control the number of threads on 'BLAS' (Aka 'GotoBLAS', 'ACML' and 'MKL'). and possible to control the number of threads in 'OpenMP'. get a number of logical cores and physical cores if feasible.
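A rough sketch of how that looks in a session (the function names come from the package; the thread counts are just illustrative):

    library(RhpcBLASctl)
    get_num_cores()            # number of physical cores
    get_num_procs()            # number of logical cores
    blas_set_num_threads(1)    # keep the BLAS single-threaded
    omp_set_num_threads(2)     # cap OpenMP parallel regions at 2 threads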

After all, you need to control the number of parallel sessions as well as the number of BLAS cores allocated to each of those sessions. There is a reason the parallel package has a default of 2 threads per session...

All of this should be largely independent of the flavour of Linux or Unix you are running. Well, apart from the fact that OS X of course (still!) does not give you OpenMP.

And the outermost level you can control from doMC and friends.
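Putting the two levels together might look like the following sketch (2 workers with a single-threaded BLAS each are just example values; the forked workers should inherit the BLAS setting from the parent):

    library(doMC)
    library(foreach)
    library(RhpcBLASctl)

    blas_set_num_threads(1)    # one BLAS thread per (future) worker
    registerDoMC(cores = 2)    # at most 2 worker processes at the outer level

    res <- foreach(i = 1:4, .combine = c) %dopar% {
      sum(rnorm(1e5))          # toy workload: roughly 2 threads busy in total
    }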

Answer by damienfrancois:

You can use registerDoMC (see the doc here):

registerDoMC(cores=<some number>)
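For example, assuming you then drive the work with foreach (the numbers are illustrative):

    library(doMC)
    library(foreach)
    registerDoMC(cores = 2)            # never more than 2 forked workers
    foreach(i = 1:8) %dopar% sqrt(i)   # iterations run on those 2 workers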

Another option is to use the ulimit command before running the R script:

ulimit -u <some number>

to limit the number of processes that can run under your user account, and hence the number of processes R will be able to spawn. Note that this limit counts all of your user's processes, not just those forked by R, so choose a value above what is already running.

If you want to limit the total number of CPUs that several R processes use at the same time, you will need to use cgroups or cpusets and attach the R processes to the cgroup or cpuset. They will then be confined to the physical CPUs defined in the cgroup or cpuset. cgroups allow more control (for instance over memory as well) but are more complex to set up.