The following code works correctly with GCC at optimization levels below -O2, and with Clang.
#include <bitset>
#include <cassert>
#include <cstdint>
#include <iostream>

constexpr uint16_t KEY_BITS = 70;
constexpr unsigned __int128 KEY_BITS_MASK = (((unsigned __int128) 1) << KEY_BITS) - 1;

struct Entry
{
    unsigned __int128 left : 4;
    unsigned __int128 right : 4;
    unsigned __int128 key : KEY_BITS;
};

Entry data;

// Prints the low 80 bits: the 16 bits above bit 63, then the low 64 bits.
void print(unsigned __int128 a)
{
    std::cerr << std::bitset<16>(a >> 64) << std::bitset<64>(a) << std::endl;
}

void store(unsigned __int128 key, uint8_t left, uint8_t right)
{
    data.left = left;
    data.right = right;
    data.key = key & KEY_BITS_MASK;
    print(key & KEY_BITS_MASK);
    print(data.key);
    assert(data.key == (key & KEY_BITS_MASK));
}
With GCC at -O2 the behavior seems undefined: adding or removing the print calls changes the result. In the "bad" case, the result of the & operation is apparently truncated to a 64-bit integer before being stored in the bit-field.
Is this a compiler bug, or am I misunderstanding how these types should behave? It doesn't seem like this should be undefined behavior.
Compiler explorer: https://godbolt.org/z/he7Kx6rMr
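For completeness, a minimal driver (not part of the original snippet; it assumes the definitions above are in scope) that exercises store with a key whose bits above bit 63 are set, so any truncation of the masked value to 64 bits trips the assert:

int main()
{
    // Bit 69 lies within the 70-bit mask but above bit 63, so truncating the
    // masked key to 64 bits would change its value and fire the assert.
    unsigned __int128 key = (((unsigned __int128) 1) << 69) | 0xABCD;
    store(key, 1, 2);
    return 0;
}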
How bit-fields are laid out in memory is implementation-defined rather than specified by the C++ standard: an implementation may insert padding bits to align them, so you cannot rely on them occupying exact bit positions in memory. The only guarantee is that storing to and reading from a bit-field behaves as if the member had the declared number of bits.
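As a small illustration of that guarantee (a hypothetical struct, not from the question): assignments behave as if the member had the declared width, while the overall size and padding are up to the implementation.

#include <iostream>

struct Flags
{
    unsigned int value : 3;   // behaves like a 3-bit unsigned integer
    unsigned int other : 5;
};

int main()
{
    Flags f{};
    f.value = 9;                        // 9 does not fit in 3 bits...
    std::cout << f.value << '\n';       // ...so reading back yields 9 mod 8 == 1
    std::cout << sizeof(Flags) << '\n'; // total size/padding is implementation-defined
    return 0;
}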
The cppreference page on bit-fields contains the same information (see the Notes section).
The C++ standard's section on bit-fields, [class.bits], likewise says that their allocation and alignment are implementation-defined and that they may or may not contain padding bits.
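If the exact bit placement matters, or to sidestep the problem above entirely, one option is to do the packing manually with shifts and masks instead of relying on the compiler's bit-field layout. A sketch, with field positions chosen arbitrarily for illustration (key in bits 0..69, left in 70..73, right in 74..77):

#include <cassert>
#include <cstdint>

constexpr unsigned __int128 ONE = 1;
constexpr uint16_t KEY_BITS = 70;
constexpr unsigned __int128 KEY_MASK = (ONE << KEY_BITS) - 1;

unsigned __int128 pack(unsigned __int128 key, uint8_t left, uint8_t right)
{
    return (key & KEY_MASK)
         | ((unsigned __int128)(left & 0xF) << KEY_BITS)
         | ((unsigned __int128)(right & 0xF) << (KEY_BITS + 4));
}

unsigned __int128 unpack_key(unsigned __int128 p) { return p & KEY_MASK; }
uint8_t unpack_left(unsigned __int128 p) { return (uint8_t)((p >> KEY_BITS) & 0xF); }
uint8_t unpack_right(unsigned __int128 p) { return (uint8_t)((p >> (KEY_BITS + 4)) & 0xF); }

int main()
{
    unsigned __int128 key = (ONE << 65) | 0x1234;  // bits above bit 63 are set
    unsigned __int128 p = pack(key, 3, 5);
    assert(unpack_key(p) == (key & KEY_MASK));
    assert(unpack_left(p) == 3);
    assert(unpack_right(p) == 5);
    return 0;
}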