Take the following piece of code:
#include <stdbool.h>

bool global;

bool foo(void) {
    if (global) {
        global = false;
        return true;
    }
    return false;
}
Is this code theoretically equal to the following?
#include <stdbool.h>

bool global;

bool baz(void) {
    bool tmp = global;
    global = false;
    return tmp;
}
I am inclined to think that foo and baz are theoretically equivalent: global is not volatile, and nothing else (an atomic_bool, for example) forces the compiler to consider multithreading. So I would expect that if a branch is far more expensive than a store, the compiler would pick the baz form over the foo form. But I cannot get the compiler to do this: GCC 13.2.0 for RISC-V with -std=gnu2x -march=rv32if -mabi=ilp32f -O3 -mbranch-cost=2000 still emits the foo form. Conversely, I am also unable to make the compiler turn baz into foo.
So am I wrong and is there a theoretical difference between foo and baz? Or is this just an optimization GCC (RISC-V?) does not take?
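To make the question concrete, here is a minimal sketch that checks the two functions against each other in a single thread. The helper agree_for is a hypothetical name introduced only for this illustration. It shows that foo and baz return the same value and leave global in the same state for both possible starting values. It says nothing about whether a compiler is allowed to turn one into the other.

```c
#include <assert.h>
#include <stdbool.h>

bool global;

bool foo(void) {
    if (global) {
        global = false;
        return true;
    }
    return false;
}

bool baz(void) {
    bool tmp = global;
    global = false;
    return tmp;
}

/* Hypothetical helper: returns true if foo and baz produce the same
   return value and the same final value of global when both start
   from the given state. Single-threaded check only. */
bool agree_for(bool start) {
    global = start;
    bool foo_result = foo();
    bool foo_state = global;

    global = start;
    bool baz_result = baz();
    bool baz_state = global;

    return foo_result == baz_result && foo_state == baz_state;
}

/* Hypothetical helper: asserts agreement for both starting values. */
void check_both_states(void) {
    assert(agree_for(true));
    assert(agree_for(false));
}
```

Calling check_both_states() from main should pass without tripping an assertion, since a bool has only these two starting states.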
Modifying an object is what C formally calls a side effect (C17 5.1.2.3), and compilers aren't allowed to optimize out "needed side effects", but "needed" is quite subjective...
The C standard doesn't really say anything about whether compilers are allowed to add new side effects not present in the original code. It just says that the compiler is not allowed to affect the outcome of the program - the "observable behavior". Accessing a volatile object affects the observable behavior, but that does not apply here.
Replacing the former function with the latter is a manual optimization that at least the programmer can do. I'm not sure why neither gcc nor clang optimizes the code to remove the branch, but we tend to over-estimate optimizers at times - they don't work by magic.
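The difference in side effects between the two forms can be made visible with a small instrumented sketch. Everything here other than the shape of foo and baz is an assumption made for illustration: flag stands in for global, and store_flag, store_count and stores_for are hypothetical helpers that count writes. The branchy form performs no store when the flag is already false, while the branchless form stores on every call.

```c
#include <stdbool.h>

static bool flag;             /* stands in for global */
static unsigned store_count;  /* number of writes made through store_flag */

/* Hypothetical helper: every write to flag goes through here so it can be counted. */
static void store_flag(bool value) {
    flag = value;
    ++store_count;
}

/* Same logic as foo: writes only when the flag was set. */
bool foo_counted(void) {
    if (flag) {
        store_flag(false);
        return true;
    }
    return false;
}

/* Same logic as baz: writes unconditionally. */
bool baz_counted(void) {
    bool tmp = flag;
    store_flag(false);
    return tmp;
}

/* Hypothetical helper: runs fn once from the given starting state
   and returns how many stores it performed. */
unsigned stores_for(bool (*fn)(void), bool start) {
    flag = start;
    store_count = 0;
    fn();
    return store_count;
}
```

With a starting value of true both variants perform one store. With a starting value of false only baz_counted does, which is the "new side effect" a foo-to-baz transformation would have to introduce.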
However, if you change the code to
Then only one side effect is "needed" and one can be removed, since the object is not volatile and writes to plain variables have no special meaning other than being side effects. The extra write here does get optimized out.
Note that the compiler likely has no clue how the external linkage variable global may be used elsewhere in the program, which might limit possible optimizations.
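As a hypothetical illustration of a side effect that is not "needed" (this is an assumed example, not necessarily the exact snippet meant above), consider two back-to-back writes to the same non-volatile object. The function name qux is made up for this sketch. Nothing reads global between the two stores, so an optimizing compiler can typically drop the first one without changing the observable behavior.

```c
#include <stdbool.h>

bool global;

/* Hypothetical example: two stores to the same plain object in a row. */
bool qux(void) {
    bool tmp = global;
    global = true;   /* overwritten by the next line with no read in between */
    global = false;  /* only this store determines the final value */
    return tmp;
}
```

Whether the first store is actually removed depends on the compiler and optimization level. Inspecting the generated assembly (for example with -O2 -S) is the way to confirm it for a given toolchain.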