How to implement large vector initialization that compiles with gcc-4.4?

I have a list of 20k strings that are known at compile time and will never change: a sort of non-configurable dictionary. I do not want to load it from a file at run time, because that would require a lot of unnecessary infrastructure: locating the file at a certain path, a configuration file to indicate that path, etc.

I came up with a solution like this in C++:

In a.cpp:

std::vector<std::string> dic;
dic.reserve(20000);
#define VECTOR_DIC_ dic
#include "values.inl"
#undef VECTOR_DIC_

Then, in values.inl, a list of 20k push_back calls, like this:

VECTOR_DIC_.push_back("string1");
VECTOR_DIC_.push_back("string2");
...
VECTOR_DIC_.push_back("string20000");

This code compiles and works properly with gcc-4.8 on Debian, but not with gcc-4.4: there, the compilation of a.cpp never finishes.

Why does gcc-4.4 not support this type of large initialization? Also, is there a design pattern for such large initialization for known values at compile time?

There are 2 best solutions below

Holt (best answer)

Use an array of const char * and then initialize your vector from it:

#include <string>
#include <vector>

char const * const S[] = {
    "string1",
    "string2"
};

const std::size_t N_STRINGS = sizeof(S) / sizeof(*S);

const std::vector<std::string> dic(S, S + N_STRINGS);

This compiles fine with g++ 4.4.7, although I did not test it with 20k strings.
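The same pointer-range construction should also work for other standard containers. As a minimal sketch, assuming membership lookups matter more than insertion order, a std::set can be built from the array in the same way. The names WORDS, N_WORDS, word_set and is_known_word are hypothetical, and the three strings stand in for the real 20k entries:

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical sample data; the real array would hold the 20k strings.
static char const * const WORDS[] = {
    "string1",
    "string2",
    "string3"
};

static const std::size_t N_WORDS = sizeof(WORDS) / sizeof(*WORDS);

// Range constructor: each C string is copied into a std::string element.
static const std::set<std::string> word_set(WORDS, WORDS + N_WORDS);

// Returns true if `word` is one of the known strings.
bool is_known_word(const std::string& word) {
    return word_set.find(word) != word_set.end();
}
```

This sticks to C++03 features, so it does not rely on initializer lists, which gcc-4.4 only supports partially.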

Michaël Roy

The compiler probably balks because the initialization is not inside a function.

To make it work, insert the initializers inside a function.

As in:

std::vector<std::string> dic;  // wouldn't a std::set be a better match?

bool InitDictionary() {
  dic.reserve(20000);
  // no trailing semicolon in the macro, or VECTOR_DIC_.push_back(...) breaks
  #define VECTOR_DIC_ dic
  #include "values.inl"
  #undef VECTOR_DIC_
  return true;
}

// you can then call InitDictionary at your discretion from within your app
// or the following line will initialize before the call to main()
bool bInit = InitDictionary();

Alternatively, the static const char* array is also viable. You would have to change your strings file to the format below; since the file is probably generated by software, I suggest it contain the entire declaration. The array should be sorted beforehand, so that you can search it with binary_search, upper_bound, etc.

const char* const dic[20000] = {  // <-- the size is optional; in the file, it documents the number of items
    "string1",
    "string2",
    "string3",
    "string4",
    // ...
};
const size_t DIC_SIZE = sizeof(dic) / sizeof(dic[0]);  // :)

You can either give the file a .cpp extension, or include it as:

#include "dictionary.inc"
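
The binary search suggested above needs a comparator that compares string contents, because the default comparison on const char* compares pointer values. A minimal sketch, assuming the array is already sorted in strcmp order; SORTED_DIC, c_str_less and dic_contains are hypothetical names, and the four words are sample data:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical sample data, already sorted in strcmp order.
static const char* const SORTED_DIC[] = {
    "apple",
    "banana",
    "cherry",
    "date"
};
static const std::size_t SORTED_DIC_SIZE =
    sizeof(SORTED_DIC) / sizeof(SORTED_DIC[0]);

// Orders C strings by content rather than by pointer value.
static bool c_str_less(const char* a, const char* b) {
    return std::strcmp(a, b) < 0;
}

// Returns true if `word` is present in the sorted array.
bool dic_contains(const char* word) {
    return std::binary_search(SORTED_DIC, SORTED_DIC + SORTED_DIC_SIZE,
                              word, c_str_less);
}
```

With this layout no std::string objects are constructed at startup, but the generator that writes the file has to emit the strings in the same order that c_str_less defines.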