The following code uses a condition variable and a monitor flag to synchronize an operation between the main thread and thread2:
#include <atomic>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

int main() {
    std::mutex m;
    std::condition_variable cv;
    std::atomic<bool> ready{false};
    std::thread thread2 = std::thread([&](){
        std::unique_lock<std::mutex> l(m);
        cv.wait(l, [&ready]{return ready.load();});
        std::cout << "Hello from thread2\n"; // 3 should print after 1
    });
    std::cout << "Hello from main thread\n"; // 1 we want this to be 1st
    ready = true; // 2, store to an atomic bool, without a lock, is it OK?
    cv.notify_one();
    thread2.join();
    std::cout << "Goodbye from main thread\n";
}
In the code above, we use atomic<bool> for the monitor flag ready for two reasons: so that the read and write of this flag do not create a data race (a non-issue on most if not all platforms, but still UB "by the book"), and to avoid reordering of the lines marked 1 and 2. The default store for an atomic variable uses memory_order_seq_cst, which guarantees that everything that happened-before the store in this thread is a visible side effect in the thread that performs a load of this variable and observes the stored value.
However, the code does not lock the modification of the ready flag (which is atomic) and the call to notify_one.
From this SO post it is clear that it is OK to leave the call to notify_one without a lock. It might even be more efficient, as we do not want thread2 to be woken by the call to notify_one only to find that it must wait for the lock and be put back to sleep by the OS scheduler until the lock is released.
However, it is not clear whether the modification of the ready flag should be done in a locked scope (using the same mutex used for the read), or whether using atomic<bool> is enough.
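Update of the ready flag MUST be locked with the same mutex used for the read (and then the boolean may become a simple bool instead of atomic<bool>).

According to cppreference, even if the shared variable is atomic, it must be modified while owning the mutex in order to correctly publish the modification to the waiting thread.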
This blog post explains quite nicely why a lock is required and why using an atomic is not enough. A similar explanation can be found in this SO post (on a related scenario) and in this additional SO post, which lists the reasons for using a lock even with atomic variables. A very similar question is also discussed and explained here and here.
The problem
Without a lock we may fall into the following race condition:
1. thread2 checks the ready flag, it is false, it plans to start waiting on the condition variable (by calling the basic cv.wait(lock) operation), but it is still before this call.
2. The main thread sets the ready flag to true and is quick enough to call cv.notify_one() while thread2 is not waiting on the condition variable yet.
3. thread2 calls cv.wait(lock) and hangs forever, as the notification was already sent and was "lost".

Let's prove the race condition
To prove that the race condition is real when not locking, we can add a sleep in thread2, between the check of the flag and the wait, that mimics a valid timing scenario.
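A minimal sketch of such an experiment follows. It is an illustration under stated assumptions, not the original snippet: the function name notification_is_lost and the helper flag checked are hypothetical, the predicate wait is expanded by hand into "check, then wait" so that a sleep fits in between, and cv.wait_for with a bounded timeout stands in for the plain cv.wait(l) so that the demo reports the lost notification instead of hanging.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: returns true when thread2's wait timed out,
// i.e. when the notification from the main thread was lost.
bool notification_is_lost() {
    using namespace std::chrono_literals;
    std::mutex m;
    std::condition_variable cv;
    std::atomic<bool> ready{false};
    std::atomic<bool> checked{false};  // assumption: only used to force the bad interleaving
    bool timed_out = false;            // written by thread2, read by main after join()

    std::thread thread2([&] {
        std::unique_lock<std::mutex> l(m);
        if (!ready.load()) {                     // step 1: the flag is still false
            checked = true;
            std::this_thread::sleep_for(50ms);   // mimics a delay before the wait starts
            // step 3: a plain cv.wait(l) would block forever at this point;
            // the bounded wait lets us observe that no notification arrives
            timed_out = (cv.wait_for(l, 500ms) == std::cv_status::timeout);
        }
    });

    while (!checked.load()) std::this_thread::yield();  // let thread2 pass its check first
    ready = true;      // step 2: store to the atomic flag without holding m
    cv.notify_one();   // thread2 is not waiting yet, so nobody is woken
    thread2.join();
    assert(ready.load());
    return timed_out;
}
```

Calling notification_is_lost() from main is expected to return true, barring a spurious wakeup: the flag was set and the notification was sent, yet thread2 never received it.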
Adding this sleep actually makes thread2 hang. QED: locking the modification of the ready flag is absolutely required; it's not just theoretical.

Solution: lock!
Locking the modification of the ready flag solves the issue, and the flag then no longer needs to be atomic.

How does it solve the race presented above?
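For reference, a locked version could look like the sketch below. The function name run_locked_handshake and the order vector are hypothetical additions, used only to record the sequence of steps so it can be checked; the mutex, condition variable and ready flag follow the code in the question, with ready now a plain bool.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: runs the handshake once and returns the order of the steps.
std::vector<std::string> run_locked_handshake() {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;               // plain bool: every access happens under m
    std::vector<std::string> order;   // only one thread touches it at any time

    std::thread thread2([&] {
        std::unique_lock<std::mutex> l(m);    // wait() needs a unique_lock
        cv.wait(l, [&] { return ready; });    // releases m while blocked
        assert(ready);                        // the predicate wait only returns when ready is true
        order.push_back("thread2");           // step 3
    });

    order.push_back("main");                  // step 1
    {
        std::lock_guard<std::mutex> l(m);     // same mutex as the reader
        ready = true;                         // step 2, now under the lock
    }
    cv.notify_one();                          // notifying after unlocking is fine
    thread2.join();
    order.push_back("goodbye");
    return order;
}
```

Whichever thread acquires the mutex first, the returned order is main, thread2, goodbye.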
The main thread cannot set ready to true while the lock is owned by thread2. Since thread2 owns the lock until it is released inside the call to cv.wait(lock) (the lock is released only when the wait starts), the main thread cannot modify the ready flag between thread2's check of the flag and the start of its wait on the condition variable. Thus, once thread2 has seen ready as false, the call to cv.notify_one() in main is guaranteed to happen when thread2 is already in the wait state. If instead the main thread acquires the lock first, thread2 sees ready as true when it checks the flag and does not wait at all.

A note on the usage of unique_lock for waiting on the condition_variable, and lock_guard for setting ready to true: the first must use unique_lock (this is the API for wait, which needs to call unlock internally); the second may use either, but we go with lock_guard, which is simpler (see also: std::unique_lock<std::mutex> or std::lock_guard<std::mutex>?).

Waiting with a timeout
Note that we may prefer waiting with a timeout (it is generally good advice to prefer waiting with a timeout, to avoid deadlocks and to have better traceability of thread status). If we wait with a timeout, we may decide to waive the locking and go back to atomic<bool>.

There is of course a timing issue with this solution, as we may wait additional time (up to the timeout duration) for thread2 to perform its operation, but if the operation is not of high priority this can be a valid solution.
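The timeout-based variant could be sketched as follows. This is an illustration, not the original snippet: the function name wait_with_timeout, the 100ms period and the timeouts counter are assumptions chosen for the example.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: returns how many times thread2 timed out
// while the flag was still false.
int wait_with_timeout() {
    using namespace std::chrono_literals;
    std::mutex m;
    std::condition_variable cv;
    std::atomic<bool> ready{false};
    int timeouts = 0;                 // written by thread2, read by main after join()

    std::thread thread2([&] {
        std::unique_lock<std::mutex> l(m);
        // wait_for with a predicate returns the predicate's value;
        // false means the period elapsed and ready is still false
        while (!cv.wait_for(l, 100ms, [&] { return ready.load(); })) {
            ++timeouts;               // a natural place to log that thread2 is still waiting
        }
        assert(ready.load());
    });

    ready = true;      // unlocked store: the notification below may be lost...
    cv.notify_one();   // ...but thread2 re-checks the flag when its wait period elapses
    thread2.join();
    return timeouts;
}
```

Even if the notification is lost, thread2 re-evaluates the predicate after at most one timeout period, so it cannot hang forever; the price is the possible extra delay mentioned above.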