Problem Background
I used UnixBench to compare performance with gcc-8.5.0 and gcc-10.3.0 and found that performance dropped by about 10% with gcc-10.3.0. I eventually traced the difference to a changed default: gcc-10.3.0 defaults to -fno-common, whereas gcc-8.5.0 defaults to -fcommon.
Looking in the GCC manual, I found this description of -fcommon:
In C code, this option controls the placement of global variables defined without an initializer, known as tentative definitions in the C standard. Tentative definitions are distinct from declarations of a variable with the extern keyword, which do not allocate storage.
The default is -fno-common, which specifies that the compiler places uninitialized global variables in the BSS section of the object file. This inhibits the merging of tentative definitions by the linker so you get a multiple-definition error if the same variable is accidentally defined in more than one compilation unit.
The -fcommon places uninitialized global variables in a common block. This allows the linker to resolve all tentative definitions of the same variable in different compilation units to the same object, or to a non-tentative definition. This behavior is inconsistent with C++, and on many targets implies a speed and code size penalty on global variable references. It is mainly useful to enable legacy code to link without errors.
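To make the manual's wording concrete, here is a minimal sketch of what I understand a tentative definition to be. The file name and the identifiers (shared_counter, bump_counter, check_counter) are made up for illustration and do not come from UnixBench.

```c
/* tentative.c: hypothetical file illustrating tentative definitions */
#include <assert.h>

int shared_counter;         /* tentative definition: file scope, no initializer */
int shared_counter;         /* repeating it in the same file is still legal C */
extern int shared_counter;  /* plain declaration: allocates no storage */

/* Increments the global and returns the new value. */
int bump_counter(void)
{
    return ++shared_counter;
}

/* Sanity check: a tentative definition ends up zero-initialized. */
void check_counter(void)
{
    assert(shared_counter == 0);
    assert(bump_counter() == 1);
}
```

If a second source file also contained `int shared_counter;`, I would expect the link to fail with a multiple-definition error under -fno-common and to succeed under -fcommon, with both files sharing one object. Compiling this file with `gcc -S` under each option and comparing the assembly should show the difference in placement (a `.comm` directive versus the BSS section).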
Questions
- Why is there such a big performance difference between -fcommon and -fno-common in the same environment?
- I suspect that GCC is not well tuned for the CPU in my environment (Sapphire Rapids). If so, how can I diagnose or work around this problem with gcc-10.3.0?
- What exactly are the risks of switching from -fno-common back to -fcommon?
I don't know GCC very well. If someone could answer any of these questions, even just one, I would greatly appreciate it.