Performance of Clang's _BitInt(256) vs Boost Multiprecision int256_t


I'm after the fastest 256 bit integer library (which isn't a nightmare to integrate).

As part of this I'm trying to get a rough idea of the performance comparison between Clang's _BitInt(256) and Boost.Multiprecision's int256_t.

I've currently got this for Clang's _BitInt(256):

#include <cstdint>
#include <iostream>
#include <x86intrin.h>  // for __rdtsc()

using int256_t = signed _BitInt(256);

int main()
{
    for(int i = 0; i < 200; ++i)
    {
        // Using __rdtsc() for something non-deterministic
        const int256_t a = __rdtsc() * __rdtsc() * __rdtsc() * __rdtsc() * __rdtsc() * __rdtsc();
        const int256_t b = __rdtsc() * __rdtsc() * __rdtsc();

        const uint64_t start = __rdtsc();
        const int256_t c = a / b;
        const uint64_t finish = __rdtsc();

        std::cout << finish - start << " " << static_cast<int64_t>(c) << std::endl;
    }
}

https://godbolt.org/z/9M9TG16ax

but it looks like the divide is getting completely optimized out. I've tried to inject some randomness into the 256-bit division using __rdtsc(). I would usually print the calculated value to prevent dead-code elimination, but ostream isn't supported for _BitInt(256), so I had to do a hacky static_cast to int64_t.

Could anyone suggest how I could profile this?

Or if there's any faster, header-only 256 bit integer library?

