Addressing mmap's random memory allocation for efficient data sharing across processes


I'm working on a project where I need to load a large array into memory, construct various indexes and derivative data, and, after complete loading and processing, ensure maximum speed of access to this data from other processes without the overhead of using SQL/NoSQL and Inter-Process Communication (IPC). Additionally, the process that utilizes these processed data needs to start almost instantly.

Is it feasible to use mmap as a replacement for the heap in this scenario to achieve rapid access to memory data created by another process without needing to reprocess it?

Furthermore, I'm facing a challenge with the dynamic allocation of addresses by mmap, which allocates random addresses in the address space. Is there a way to alter the behavior of mmap in this regard? Should I consider using some form of indexes or offsets? Or is there an entirely different approach that would be more suitable for this task?


There are 2 best solutions below

rostamn739 On BEST ANSWER

You can use a shared memory framework for what you describe. I see the C tag, but if C++/Boost is an option, Boost.Interprocess takes care of offset-based pointers (`offset_ptr`) and allocation inside the shared memory segment for you.

I don't see the need for the memfd mentioned in the other answer, since ordinary shared memory fits your scenario perfectly well. On Linux it is just a regular file backed by tmpfs.

Tanmay Patil On

mmap would solve your problem. Its behaviour is exactly what you need.

The address returned by mmap is a virtual memory address local to your process. Treat it as a base address and use offsets from it to access the contents.

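When you mmap the same file in a different process, you will get a different address, which is a virtual memory address local to that process. Again, treat that address as the base and work with the same offsets as before. This means any references stored inside the mapped data must be offsets, not raw pointers, because a pointer written by one process is meaningless in another.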
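To make the base-plus-offset idea concrete, here is a minimal sketch in C. The `region_header` layout and the helper names (`map_region`, `at_offset`, `build_region`, `read_value`) are made up for illustration and are not from the question. It assumes a POSIX system and that both sides agree on the file size.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical layout: the header at offset 0 stores offsets, never pointers. */
struct region_header {
    uint64_t count;          /* number of values stored */
    uint64_t values_offset;  /* byte offset of the values array from the base */
};

/* Map `size` bytes of `path` as shared memory. Returns the base or NULL. */
static void *map_region(const char *path, size_t size, int create)
{
    int flags = create ? (O_RDWR | O_CREAT | O_TRUNC) : O_RDWR;
    int fd = open(path, flags, 0600);
    if (fd < 0) return NULL;
    if (create && ftruncate(fd, (off_t)size) != 0) { close(fd); return NULL; }
    void *base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return base == MAP_FAILED ? NULL : base;
}

/* Turn a stored offset into a pointer that is valid only for this mapping. */
static void *at_offset(void *base, uint64_t offset)
{
    return (char *)base + offset;
}

/* Producer: build the data once and record where it lives as an offset. */
static int build_region(const char *path, size_t size, uint64_t count)
{
    if (size < sizeof(struct region_header) + count * sizeof(uint64_t)) return -1;
    void *base = map_region(path, size, 1);
    if (!base) return -1;
    struct region_header *h = base;
    h->count = count;
    h->values_offset = sizeof *h;  /* values start right after the header */
    uint64_t *values = at_offset(base, h->values_offset);
    for (uint64_t i = 0; i < count; i++) values[i] = i * i;
    msync(base, size, MS_SYNC);    /* flush to the file before unmapping */
    return munmap(base, size);
}

/* Consumer: map the file at whatever address the kernel picks and look up
 * one value through the stored offset. No reprocessing is needed. */
static int read_value(const char *path, size_t size, uint64_t index, uint64_t *out)
{
    void *base = map_region(path, size, 0);
    if (!base) return -1;
    const struct region_header *h = base;
    int rc = -1;
    if (index < h->count) {
        const uint64_t *values = at_offset(base, h->values_offset);
        *out = values[index];
        rc = 0;
    }
    munmap(base, size);
    return rc;
}
```

The producer would call `build_region("/some/path", size, n)` once. Each consumer then calls `read_value` (or keeps the mapping open) and starts almost instantly, since pages are faulted in on demand rather than loaded up front.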

The filesystem makes the data accessible to all processes via the file name. If you don't want the data to be accessible to arbitrary processes, you can use memfd_create and pass the file descriptor to the intended process through another IPC mechanism, such as a Unix domain socket.
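A rough sketch of the memfd variant follows, assuming Linux with glibc 2.27 or later. For brevity the descriptor reaches the second process by inheritance across `fork`. An unrelated process would need it sent over a Unix domain socket instead. The helper names `make_memfd` and `child_sees` are hypothetical.

```c
#define _GNU_SOURCE  /* needed for memfd_create */
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create an anonymous in-memory file of `size` bytes containing `msg`.
 * Returns the file descriptor, or -1 on failure. */
static int make_memfd(const char *msg, size_t size)
{
    if (strlen(msg) + 1 > size) return -1;
    int fd = memfd_create("shared_index", 0);  /* the name is only a debug label */
    if (fd < 0) return -1;
    if (ftruncate(fd, (off_t)size) != 0) { close(fd); return -1; }
    char *base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { close(fd); return -1; }
    strcpy(base, msg);
    munmap(base, size);  /* the data lives on in the memfd */
    return fd;
}

/* Fork a child that maps the inherited descriptor at its own address and
 * compares the contents. Returns 0 if the child saw `expected`, else -1. */
static int child_sees(int fd, size_t size, const char *expected)
{
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        const char *base = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
        if (base == MAP_FAILED) _exit(2);
        _exit(strcmp(base, expected) == 0 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Since the memfd has no path in the filesystem, only processes that actually hold the descriptor can map it.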