I'm checking CPUID flags on x86-64 for instruction set extensions, and caching the results in a static array. The function doing this may be called from any of several different threads, so race conditions are possible - but does this matter? Any thread will produce the same results, so I don't care about conflicting writes, and checking the same flags more than once is a non-issue. My conclusion is that this is safe, but am I missing something?
Here's a stripped-down version of what I'm doing:
enum class Arch : int
{
SSE2,
AVX,
MAX
};
static bool CheckCPUID(Arch arch)
{
// Check CPU features
}
static bool IsSupportedArch(Arch arch)
{
enum class TriVal : int
{
Unknown,
True,
False
};
constexpr int arch_max = static_cast<int>(Arch::MAX);
const int i_arch = static_cast<int>(arch);
// Cache results in-memory:
static TriVal cached_results[arch_max - 1] = { TriVal::Unknown, TriVal::Unknown };
if (cached_results[i_arch] == TriVal::Unknown)
{
cached_results[i_arch] = CheckCPUID(arch) ? TriVal::True : TriVal::False;
}
return cached_results[i_arch] == TriVal::True;
}
Conflicting writes to non-atomic objects are a data race and a data race always causes undefined behavior. The compiler may optimize based on that.
Therefore you must care about this.
You can use a
std::array<TriVal, arch_max - 1>for thestaticlocal variablecached_resultsand have it be set in its initializer. The initialization of static local variables is guaranteed to be thread-safe and executed exactly once (assuming C++11 or later).Alternatively you can mark the variable
thread_local, but in that case every thread will call CPUID once.If you really care that CPUID is called only as few number of times as required, you can make the array elements
std::atomics:In that case it is ok to have concurrent writes and as you say you don't care about which one ends up being the value. You probably want to replace the loads and stores with relaxed memory order operations though, since you don't need the default
seq_cstguarantees: