I am trying to make a simple tool that needs lookups on a fixed key-value dataset, so I lazily put all the data into a hash map in a header file:
/** main.h */
#include <unordered_map>
#include <cstdint>
using namespace std;
const unordered_map<uint64_t, const char * const> test = {
{0xDEADC0DE, "Some short text less than 50 characters"},
// 46K rows of data
};
I haven't implemented anything yet, but just including this header file is enough to crash the compiler.
main.cpp
/** main.cpp */
#include <iostream>
#include "main.h"
int main() {
return 0;
}
After maxing out a CPU core for 5 minutes, g++ (cc1plus) eats up all 32 GB of RAM and crashes. I know a large header can hurt compile times, but I did not expect it to exhaust resources and fail. How does it use 32 GB of RAM when the header file is only 1.9 MB? Could someone please explain what is happening in my case?
The version I am using is g++ (GCC) 13.2.1 20230801, with the command /usr/bin/g++ -O3 -DNDEBUG -o CMakeFiles/main.cpp.o -c /home/foo/main.cpp
Update
I also did some experiments with different sizes of map:
| Number of elements | Build time (hh:mm:ss.mmm) |
|---|---|
| 10 | 00:00:01.043 |
| 100 | 00:00:01.187 |
| 1000 | 00:00:05.225 |
| 2000 | 00:00:10.200 |
| 5000 | 00:00:25.604 |
| 10000 | 00:00:52.208 |
| 20000 | 00:01:48.090 |
Update
The problem is solved by disabling compiler optimization. I am using the VS Code CMake extension, and its Release profile adds -O3 to the g++ arguments. Removing it lets the project (46K rows) compile in 6 seconds. The compiler must be trying hard to apply some optimization magic that unfortunately goes wrong.
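If -O3 has to stay on for the rest of the project, one possible workaround is to keep the rows in a plain array of simple structs and fill the map from it at run time, so the compiler only sees static data plus a small loop instead of one huge initializer list of pairs. This is only a sketch: Row, kRows and lookup_table are made-up names, the second row is invented for illustration, and whether it actually brings the build time down for 46K rows would need to be measured.

```cpp
#include <cstdint>
#include <unordered_map>

// Plain aggregate with no constructors, so the table below is just static data.
struct Row {
    std::uint64_t key;
    const char* text;
};

// Hypothetical name; the 46K rows would go here instead of into the map.
static constexpr Row kRows[] = {
    {0xDEADC0DE, "Some short text less than 50 characters"},
    {0x0BADF00D, "Another hypothetical row"},
};

// Builds the map once, on first use, with an ordinary loop.
const std::unordered_map<std::uint64_t, const char*>& lookup_table() {
    static const std::unordered_map<std::uint64_t, const char*> table = [] {
        std::unordered_map<std::uint64_t, const char*> m;
        m.reserve(sizeof(kRows) / sizeof(kRows[0]));
        for (const Row& row : kRows) {
            m.emplace(row.key, row.text);
        }
        return m;
    }();
    return table;
}
```

Callers would then write lookup_table().find(key) where they previously used test.find(key). The trade-off is a small one-time cost at first use to populate the map.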
The large header file is killing your performance. Don’t bully your compiler!
Imagine including main.h, which as you say has over forty-six thousand (46000 + 1) elements, in every source file that needs it. That would mean you duplicate the large object test everywhere you include it, and your compiler is forced to preprocess the header and compile it everywhere it is included. This is bad, really bad!

Like I mentioned in my comment, the object test should be defined in a single translation unit, should have static storage duration and should have external linkage. This is so that it will be compiled once, will live until program termination and can be made available in other translation units by using the extern keyword to refer to it, for example defined in test.cpp and used from main.cpp.

Anywhere you want to use test, simply declare it extern and you will be referring to the same object in static storage. This way, you will avoid copying test and also save some compilation cost.
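A minimal sketch of that layout, squeezed into one listing with comments marking where each file would begin. The header name test.h and the helper lookup_text are assumptions made for illustration; only test and its type come from the question.

```cpp
#include <cstdint>
#include <unordered_map>

// ---- test.h (hypothetical header): declaration only, cheap to include ----
extern const std::unordered_map<std::uint64_t, const char* const> test;

// ---- test.cpp: the one and only definition ----
// In a real project this file would start with #include "test.h". The extern
// declaration must be visible here, because a const object at namespace scope
// otherwise has internal linkage and other files could not link against it.
const std::unordered_map<std::uint64_t, const char* const> test = {
    {0xDEADC0DE, "Some short text less than 50 characters"},
    // ... remaining rows ...
};

// ---- main.cpp or any other user: sees only the declaration from test.h ----
// lookup_text is a hypothetical helper; it returns nullptr when the key is absent.
const char* lookup_text(std::uint64_t key) {
    const auto it = test.find(key);
    return it == test.end() ? nullptr : it->second;
}
```

With this split, only test.cpp pays for compiling the 46K rows, and editing main.cpp no longer triggers a rebuild of the table. The cost of compiling test.cpp itself under -O3 is not changed by this layout.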