I have an application that opens a file with mmap() and does stuff to it (long story short, makes calls to gdb to parse a coredump file and then 7z to compress the dump). What I am trying to achieve is setting a limit on how much resident memory (a.k.a. actual RAM) can be used by this application, while letting it use as much total virtual memory as it wants.
There are two main suggestions I've seen to achieve this: ulimit and cgroups.
mmap: an observation
Before moving forward, a note on mmap: my understanding is the whole point of using it is to minimize the total amount of memory used to read file. This works by having the mmap'ed file backed up by themselves, not by swap or RAM. However, when I start my application (that uses mmap) and look at the output from top, I notice it still reports the application as using a large amount of virtual memory... using just a bit under the size of the file that is being opened with mmap. So a 15GB file might report 0.5GB of RAM usage and 14.5GB of virtual memory usage. So does this mean mmap needs to load the entire file into (virtual) memory or is this just a quirk of the way Linux reports memory usage for mmap (as in, it "counts" the space on the hard drive where the file is located as virtual memory)?
ulimit
ulimit only supports setting a limit for virtual memory as a whole. There is no way to way to specify a limit for only resident memory, which is what I'm interested in. Since mmap appears to use roughly the same amount of virtual memory as the size of the file it is opening (as described above), this doesn't work for me. Set ulimit -v to any thing less, and my application crashes.
cgroups
cgroups lets us set a specific limit for resident memory with memory.limit_in_bytes. I tried creating a cgroup and running my application with it. Here I saw a phenomenon that's left me stumped: on a machine with only 4GB of RAM and 2 CPUS, the cgroup seems to respect the RAM usage limit I set, with the limit_in_bytes only set to 100MB. However on a machine with 500GB, 60 CPUs and a limit of 100 bytes, the exact same file, exact same application (same executable, not rebuilt on the new machine or anything), setting the same 100MB limit leads to the application crashing. Only when I set the limit back to around the same size as the file being mmapd, can it run successfully.
So there are a two questions here:
Does
mmapneed to load the whole file into virtual memory to work or not? My evidence points to yes after tryingulimit... and no after my experiment withcgroups, on the 4GB machine.Any suggestions on what other factors could explain why the 4GB is able to successfully work with the
cgrouplimit, but not the 500GB machine?