Fill boost::unordered_set quickly

132 Views Asked by At

I am trying to construct as quick as possible a set of elements based on the result of an image classification.

In the details, I would like to store in this set all the (r, g, b) pixels which belong to a certain class. The problem has 2 classes, I would like to retain the pixels which are from class 1 and discard the pixels from class 0. The classification is done using a trained mlpack classifier on a (r, g, b) vector of double.

I have to use a boost::unordered_set<uint32_t> for this task or similar. The code up to now looks like this

boost::unordered_set<uint32_t> bset;

for (int r = 0; r < 256; r++)
{
    for (int g = 0; g < 256; g++)
    {
        for (int b = 0; b < 256; b++)
        {
            arma::rowvec xp = { (double)b, (double)g, (double)r };
            if ((bool)(clf.Classify(xp))) {
                uint32_t cachePos = r + (g << 8) + (b << 16);
                bset.insert(cachePos);
            }

        }
    }
}

I have made some benchmarks and the slowest part is the insertion with insert(). To scan all the possible (r, g, b) it takes about 5 seconds. Since the code is called from a GUI I would like it to be faster to reduce the time a user has to wait for the result.

First I tried to change .insert() with .emplace() but as I expected there is little improvement.

I also tried filling another container, actually std::vector was quite fast, and the copying its content in the set using iterators:

std::vector<int> predictions;
for (int r = 0; r < 256; r++)
{
    for (int g = 0; g < 256; g++)
    {
        for (int b = 0; b < 256; b++)
        {
            arma::rowvec xp = { (double)b, (double)g, (double)r };
            if ((bool)(clf.Classify(xp))) {
                uint32_t cachePos = r + (g << 8) + (b << 16);
                predictions.push_back(cachePos);
            }

        }
    }
}

bset = boost::unordered_set<uint32_t>(predictions.begin(), predictions.end());

But still, the last line takes a lot of time, around 2-3 seconds. Do you have any hint for me? What can I do to improve the speed of my code? Are there faster container that I can use to replace boost::unordered_set ? The container should contain elements from class 1 only.

1

There are 1 best solutions below

2
Alexey Astakhov On

Since unordered_set can not predict number of elements to be inserted, it performs allocation and insertion for each particular element. In your case it's exactly 256 x 256 x 256 = 16,777,216 times. The bottleneck in this situation is memory allocation. Since vector consumes memory by chunks - it inserts much faster. The solution is to use the 'reserve' method:

boost::unordered_set<uint32_t> bset;
bset.reserve(256*256*256);

for (int r = 0; r < 256; r++)
{
    for (int g = 0; g < 256; g++)
    {
        for (int b = 0; b < 256; b++)
        {
            arma::rowvec xp = { (double)b, (double)g, (double)r };
            if ((bool)(clf.Classify(xp))) {
                uint32_t cachePos = r + (g << 8) + (b << 16);
                bset.insert(cachePos);
            }

        }
    }
}