I'm optimizing a compression algorithm that uses a structure spanning 2 bytes. At times, though, I'd like it to interpret just 1 byte, since the members that (I expect) map onto the 2nd byte are never written or read.
Do I have any guarantee that the compiler will not access the 2nd byte so long as zFmt and wFmt are never accessed? If not, can I compose a static assertion that will stop compilation when this assumption is wrong?
    #include <cstdint> // uint8_t
    #include <new>     // placement new

    struct Header {
        uint8_t xFmt : 4;
        uint8_t yFmt : 4;
        uint8_t zFmt : 4; // must not be read/written when header is mapped to 1 byte
        uint8_t wFmt : 4; // must not be read/written when header is mapped to 1 byte
    };
    static_assert( sizeof(Header) == 2 && alignof(Header) == 1, "alignment vital");

    // --- usage ---
    int main(){
        // Header may be placed into memory where it overlaps only one byte;
        // in that case, its .zFmt and .wFmt members are never read or written to
        char buffer[1];
        Header * header = new (buffer) Header;

        // can I be sure (or statically assert) that these instructions
        // will only read and write to the nearest (and only) owned byte?
        header->xFmt = 0;
        header->yFmt = 0;
        header->xFmt += 1;
        header->yFmt += 1;
    }
Side notes:
The algorithm currently works, but I want to make sure it doesn't rely on undefined behavior. I believe strict-aliasing is adhered to by using placement new, but maybe that assumption is incorrect?
Also, I want to use this structure and bit-fields, in this way... because they look nice! Not the best reason haha, so if this isn't possible, my fallback is to interpret the bytes without a structure, as uint8_t with shifts and masks. I'm also aware I could perform slicing using inheritance, which I'll research if this is undefined.
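For reference, the shifts-and-masks fallback could look roughly like the sketch below. The helper names (`getX`, `setX`, and so on) are hypothetical. The nibble layout (xFmt in the low 4 bits, yFmt in the high 4 bits) is an assumption made here, since bit-field ordering within a byte is implementation-defined and may not match what the struct produces on a given compiler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for the "no struct" fallback.
// Assumed layout: xFmt = low nibble, yFmt = high nibble of a single byte.
constexpr std::uint8_t kNibbleMask = 0x0F;

constexpr std::uint8_t getX(std::uint8_t byte) {
    return static_cast<std::uint8_t>(byte & kNibbleMask);
}

constexpr std::uint8_t getY(std::uint8_t byte) {
    return static_cast<std::uint8_t>((byte >> 4) & kNibbleMask);
}

inline void setX(std::uint8_t& byte, std::uint8_t value) {
    assert(value <= kNibbleMask); // value must fit in 4 bits
    byte = static_cast<std::uint8_t>((byte & 0xF0) | value);
}

inline void setY(std::uint8_t& byte, std::uint8_t value) {
    assert(value <= kNibbleMask); // value must fit in 4 bits
    byte = static_cast<std::uint8_t>((byte & 0x0F) | (value << 4));
}
```

Because every access goes through a single `uint8_t`, only that one byte is ever read or written, which sidesteps the question of what the compiler does with the second byte.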
Answering my own question, since I believe I've found the answer in a 2020 edition of the C++ ISO standard (bolding the relevant parts):
The note about "two or more threads of execution" accessing separate memory locations leads me to believe the following:
yFmt can't cause an observable side-effect. As pointed out by Nate's comment, it's possible for the compiled code to still load and store to where it thinks zFmt lives, while upholding this requirement. But if the access must be atomic and preserve the previous value (and so long as the program owns this memory), then there's only one behavior that's possible. (As for the program owning the memory, so long as the atomic instruction works on memory aligned more granularly than the program can own memory, which I believe is the case, then I'm comfortable assuming I can't cause an access violation in this way.)

Along with the rule that all members and bit-fields of a structure must have an increasing address, I believe this makes my usage well-defined, with the following change:
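A minimal sketch of one change of this kind, offered as an assumption rather than a definitive version: the standard's memory-location rule treats a maximal run of adjacent non-zero-width bit-fields as one memory location, and a zero-width unnamed bit-field ends that run. The name `SplitHeader` is hypothetical, and the resulting size and alignment are still implementation-defined, hence the `static_assert`.

```cpp
#include <cstdint>

// Hypothetical variant of Header: the zero-width bit-field splits the
// four bit-fields into two separate memory locations.
struct SplitHeader {
    std::uint8_t xFmt : 4;
    std::uint8_t yFmt : 4;
    std::uint8_t      : 0; // zero-width: the next bit-field starts a new memory location
    std::uint8_t zFmt : 4; // must not be read/written when header is mapped to 1 byte
    std::uint8_t wFmt : 4; // must not be read/written when header is mapped to 1 byte
};

// Stops compilation if the implementation lays the struct out differently.
static_assert(sizeof(SplitHeader) == 2 && alignof(SplitHeader) == 1,
              "alignment vital");
```

With this layout, xFmt and yFmt share one memory location and zFmt and wFmt share another, so a write to xFmt or yFmt is not permitted to observably disturb the second pair.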
Not saying it's not prone to causing programmer mistakes of course, but I'm fairly convinced this is well-defined, at least for the time being.