C11 memory fence and atomic operation

I'm studying memory barriers and have some questions about the following code.

/* version 1 */
/* Thread A */
    *val = 1;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(published, 1, memory_order_relaxed);

/* Thread B */
    if (atomic_load_explicit(published, memory_order_relaxed) == 1) {
        atomic_thread_fence(memory_order_acquire);
        assert(*val == 1); /* will never fail */
    }

/* version 2 */
/* Thread A */
    *val = 1;
    atomic_thread_fence(memory_order_release);
    *published = 1;

/* Thread B */
    if (*published == 1) {
        atomic_thread_fence(memory_order_acquire);
        assert(*val == 1); /* may fail */
    }

  1. Does atomic_thread_fence affect only atomic loads and stores? And does it constrain the compiler, or only the CPU?
  2. In version 2, where the store to published is non-atomic, why can the assertion fail despite the fences, if atomic_thread_fence is only meant to order atomic loads and stores?
  3. Why is *val = 1 not written as atomic_store_explicit(val, 1, memory_order_relaxed)?

Answer by Nate Eldredge:
  1. Fences do affect non-atomic loads and stores. For instance, no load or store that follows an acquire fence in program order, whether atomic or not, may be reordered before that fence; otherwise the fence couldn't establish the necessary synchronization. "Reordering" here covers both compile-time reordering of instructions and run-time out-of-order execution; a fence has to inhibit them both.

  2. It's not really that the fence is "only meant for atomic" operations. It's simply that, assuming published is non-atomic in version 2, then you have a data race on published: you have two non-atomic accesses in different threads, at least one of them a write, and no synchronization to make one of them happen-before the other. So the program's behavior is undefined.

    The fences aren't a problem here; they just don't do anything to help avoid the data race. Release/acquire fences are only effective when paired with an atomic load that observes the value of an atomic store. In other contexts they are harmless but also useless.

  3. In version 1, *val is safe to access non-atomically. You have a release fence followed by a store (to published, of the value 1), and a load that, if it observes the store, is followed by an acquire fence. This is exactly the setup of 7.17.4p2 in the C17 standard, so the release fence synchronizes with the acquire fence (assuming that the acquire fence is actually reached). Therefore your store of *val happens-before your load of *val (if the load occurs at all), so there is no data race on *val, and the load is guaranteed to observe the stored value (5.1.2.4p20). There is also no data race on published because it is atomic.
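
To make the version 1 pattern concrete, here is a minimal sketch of it as a compilable unit. Several things in it are assumptions and not part of the question: it uses POSIX threads to start the two threads, it makes val a plain global int and published a global atomic_int instead of pointers, the names writer, reader, and run_handoff are hypothetical, and the reader spins until it observes the flag where the question used a single if (so the acquire fence is always reached).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int val;               /* plain, non-atomic payload */
static atomic_int published;  /* atomic flag, starts at 0 */

/* Thread A: write the payload, release fence, then relaxed atomic store. */
static void *writer(void *arg)
{
    (void)arg;
    val = 1;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&published, 1, memory_order_relaxed);
    return NULL;
}

/* Thread B: relaxed atomic load; once it observes 1, acquire fence,
   then a plain read of the payload. */
static void *reader(void *arg)
{
    int *seen = arg;
    /* Assumption: spin until the flag is observed (the question used an if). */
    while (atomic_load_explicit(&published, memory_order_relaxed) != 1)
        ;
    atomic_thread_fence(memory_order_acquire);
    assert(val == 1);  /* the write to val happens-before this read */
    *seen = val;
    return NULL;
}

/* Hypothetical helper: runs one handoff and returns the value the reader
   saw, or -1 if a thread could not be created. */
int run_handoff(void)
{
    pthread_t a, b;
    int seen = -1;

    /* Reset state; pthread_create orders these writes before the threads. */
    val = 0;
    atomic_store_explicit(&published, 0, memory_order_relaxed);

    if (pthread_create(&a, NULL, writer, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, reader, &seen) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return seen;
}
```

Calling run_handoff() from main should return 1 on every run. If published were changed to a plain int, as in version 2, the program would contain a data race and no result could be relied upon, regardless of the fences.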