High-load C++ logging

I’m using the spdlog library in a high-load service. The service receives requests from another service over gRPC.

At the moment I can’t use streaming gRPC, so every call creates its own connection. As far as I know, I can’t limit the number of threads gRPC creates.

My service has now started to hit a performance bottleneck because every thread writes to the same log file, which is guarded by a mutex inside spdlog.

What solutions are there to this problem besides horizontal scaling (increasing the number of processes)?

There are 2 best solutions below

RandomBits On

If I understand your requirements correctly, you have an application in which multiple threads log synchronously to a single log file using spdlog.

I answered a similar question about logging by suggesting a memory-mapped file, which turned out to be about 100x faster than spdlog for the prototype code posted in that answer.
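The prototype from that other answer is not reproduced here. As a rough sketch of the general idea only, assuming a POSIX system (`open`, `ftruncate`, `mmap`), each thread can reserve a slice of the mapped file with an atomic compare-and-swap and then copy its line in without taking a lock. The names `MmapLog`, `append` and `mmap_demo` are made up for this illustration, and a real logger would also need to roll over to a new file when the mapping fills up.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

#include <fcntl.h>     // POSIX: open
#include <sys/mman.h>  // POSIX: mmap, munmap
#include <unistd.h>    // POSIX: ftruncate, close

// Hypothetical helper: a fixed-size log file mapped into memory.
class MmapLog {
public:
    MmapLog(const char* path, std::size_t capacity) : capacity_(capacity) {
        fd_ = ::open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
        if (fd_ < 0) throw std::runtime_error("open failed");
        if (::ftruncate(fd_, static_cast<off_t>(capacity_)) != 0) {
            ::close(fd_);
            throw std::runtime_error("ftruncate failed");
        }
        void* p = ::mmap(nullptr, capacity_, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0);
        if (p == MAP_FAILED) {
            ::close(fd_);
            throw std::runtime_error("mmap failed");
        }
        base_ = static_cast<char*>(p);
    }

    ~MmapLog() {
        ::munmap(base_, capacity_);
        // Trim the file to the bytes actually written (best effort).
        if (::ftruncate(fd_, static_cast<off_t>(offset_.load())) != 0) { /* ignored */ }
        ::close(fd_);
    }

    // Reserve [start, start + n) atomically, then copy outside any lock.
    // Returns false when the mapping is full.
    bool append(const std::string& line) {
        const std::size_t n = line.size() + 1;  // +1 for the newline
        std::size_t start = offset_.load();
        do {
            if (start + n > capacity_) return false;
        } while (!offset_.compare_exchange_weak(start, start + n));
        std::memcpy(base_ + start, line.data(), line.size());
        base_[start + line.size()] = '\n';
        return true;
    }

private:
    std::size_t capacity_;
    int fd_ = -1;
    char* base_ = nullptr;
    std::atomic<std::size_t> offset_{0};
};

// Four threads write 100 lines each; returns the number of lines in the file.
std::size_t mmap_demo(const char* path) {
    {
        MmapLog log(path, 1 << 20);
        std::vector<std::thread> threads;
        for (int t = 0; t < 4; ++t)
            threads.emplace_back([&log, t] {
                for (int i = 0; i < 100; ++i)
                    log.append("thread " + std::to_string(t) + " line " + std::to_string(i));
            });
        for (auto& th : threads) th.join();
    }  // destructor unmaps and trims the file
    std::ifstream in(path);
    std::size_t lines = 0;
    std::string line;
    while (std::getline(in, line)) ++lines;
    return lines;
}
```

The writes land in the page cache and the kernel flushes them to disk later, so the logging threads never wait on file I/O. The flip side is that this sketch makes no durability promise beyond what the OS gives a shared mapping.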

freakish On

The mutex itself is not the problem. The problem is that the library probably does I/O (file writes) while holding the mutex (although, honestly, I do not know the details). It can get even worse if network drives (or even network services) are involved.

So one possible solution is to delegate this heavy I/O to a separate, dedicated thread. Every other thread then communicates with the dedicated thread by pushing messages (log entries) onto a thread-safe (preferably lock-free) queue. With this approach it is extremely likely that logging won't be a bottleneck anymore.
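spdlog ships an asynchronous mode (`spdlog::async_logger`) built on the same idea. To show the mechanism itself, here is a minimal standard-library sketch. `AsyncLog` and `demo_line_count` are hypothetical names, the output stream stands in for the log file, and the queue is a plain mutex-protected `std::deque` rather than a lock-free one. The point is that producers hold the lock only for a push, while the slow writes happen on the writer thread outside the lock.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: producers enqueue messages, one background
// thread performs every write to the output stream.
class AsyncLog {
public:
    explicit AsyncLog(std::ostream& out) : out_(out), worker_([this] { run(); }) {}

    ~AsyncLog() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();  // the writer drains pending messages before exiting
    }

    // Callable from any thread; the lock is held only for the push.
    void log(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(msg));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::deque<std::string> batch;
        for (;;) {
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
                if (queue_.empty()) return;  // shutting down, nothing left
                batch.swap(queue_);          // grab everything in one go
            }
            // Slow I/O happens here, outside the lock.
            for (const auto& m : batch) out_ << m << '\n';
            batch.clear();
        }
    }

    std::ostream& out_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<std::string> queue_;
    bool done_ = false;
    std::thread worker_;  // declared last so it starts after the other members exist
};

// Several threads log concurrently; returns how many lines were written.
std::size_t demo_line_count(int threads, int per_thread) {
    assert(threads > 0 && per_thread > 0);
    std::ostringstream sink;  // stands in for the log file
    {
        AsyncLog log(sink);
        std::vector<std::thread> workers;
        for (int t = 0; t < threads; ++t)
            workers.emplace_back([&log, t, per_thread] {
                for (int i = 0; i < per_thread; ++i)
                    log.log("worker " + std::to_string(t) + " msg " + std::to_string(i));
            });
        for (auto& w : workers) w.join();
    }  // destructor drains the queue and joins the writer
    const std::string text = sink.str();
    return static_cast<std::size_t>(std::count(text.begin(), text.end(), '\n'));
}
```

Swapping the whole queue into a local batch keeps the critical section short even when many messages are pending. Note that the queue in this sketch is unbounded.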

The drawback is that log processing becomes decoupled from the calls that produce the logs. This has two negative consequences:

  1. Logging may lag behind the calls. You keep processing data while the logging falls further and further behind. So not only do logs not arrive on time, but in case of a system failure you will also lose all the pending logs. In practice, however, this is very rarely a real issue.
  2. Under heavy load the queue may grow faster than the dedicated thread can drain it, and you may run out of memory. This can be solved by putting a semaphore somewhere so that the queue is not allowed to grow too large. This is a fallback to blocking calls, but it has to kick in at some threshold, otherwise you risk a crash.
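The threshold from point 2 can be sketched as a bounded queue whose `push` blocks once a fixed capacity is reached. This version uses two condition variables in place of a semaphore, which is an assumption of the sketch and not the only way to do it. `BoundedLogQueue`, `DemoResult` and `run_bounded_demo` are hypothetical names, and the writer only counts messages where a real one would write to the file.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

class BoundedLogQueue {
public:
    explicit BoundedLogQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Blocks the caller while the queue is full: this is the backpressure.
    void push(std::string msg) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return queue_.size() < capacity_; });
        queue_.push_back(std::move(msg));
        if (queue_.size() > high_water_) high_water_ = queue_.size();
        lock.unlock();
        not_empty_.notify_one();
    }

    // Returns false once close() has been called and the queue is drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop_front();
        lock.unlock();
        not_full_.notify_one();
        return true;
    }

    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        not_empty_.notify_all();
    }

    // Largest queue size observed so far.
    std::size_t high_water() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return high_water_;
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::deque<std::string> queue_;
    std::size_t high_water_ = 0;
    bool closed_ = false;
};

struct DemoResult {
    std::size_t written;
    std::size_t high_water;
};

DemoResult run_bounded_demo(std::size_t capacity, int producers, int per_producer) {
    BoundedLogQueue q(capacity);
    std::size_t written = 0;
    std::thread writer([&] {
        std::string msg;
        while (q.pop(msg)) ++written;  // a real writer would do file I/O here
    });
    std::vector<std::thread> threads;
    for (int p = 0; p < producers; ++p)
        threads.emplace_back([&q, per_producer] {
            for (int i = 0; i < per_producer; ++i) q.push("request handled");
        });
    for (auto& t : threads) t.join();
    q.close();
    writer.join();
    return {written, q.high_water()};
}
```

Memory use is capped at `capacity` pending messages, and no message is dropped. The price is that request threads stall whenever the writer falls that far behind. Dropping or overwriting old messages instead of blocking is the other common policy, trading completeness for latency.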

With such a design, logging should no longer be a bottleneck.