Is locking necessary in this function?

79 Views Asked by At

Here is an example of a mutex lock.

#include <pthread.h>

pthread_mutex_t count_mutex;
long long count;

void increment_count()
{
    pthread_mutex_lock(&count_mutex);
    count = count + 1;
    pthread_mutex_unlock(&count_mutex);
}

long long get_count()
{
    long long c;
    
    pthread_mutex_lock(&count_mutex);
    c = count;
    pthread_mutex_unlock(&count_mutex);

    return (c);
}

The document says that "The increment_count() function uses the mutex lock simply to ensure an atomic update of the shared variable". Ok, that's fine.

I have a problem with the way it explains the use of locking in get_count(): "The get_count() function uses the mutex lock to guarantee that the 64-bit quantity count is read atomically".

If I'm not wrong, right after unlocking count_mutex in get_count another thread can call increment_count and make the result from get_count incorrect. That's inevitable, right? Then why not just make this?

long long get_count()
{
    return count;
}
2

There are 2 best solutions below

3
Yakk - Adam Nevraumont On BEST ANSWER

The risk is not the wrong value, but an incoherent value.

Ie, getting a value that either count never held.

Suppose count goes from -1 to 0. And the reader gets the high bytes while it is -1 and the low bytes while it is 0. The reader reads 0xffffffff00000000, which is roughly 2^-32.

This is significantly different than any value it has ever held.

Things get even worse on some platforms. One problem is multi-core cache consistency. Locking operations map to not only mutually exclusive critical sections, but also requiring cache lines to be made consistent between CPU memory caches.

You can literally never see a change to a memory location if you don't get a lock in your thread.

2
John Bollinger On

Is locking necessary in this function?

TL;DR: YES.

From the perspective of the C language specification, the issue is:

Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.

[...]

The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.

(C17 5.1.2.4/4,35; emphasis added)

Protecting both the read and the write via the same mutex is what ensures that there is a "happens before" relationship between every read and every write. (Side note: what the spec means there by "atomic" is not what your document means when it says that protecting the write with a mutex makes it atomic.)

Undefined behavior applies to the whole program, not just the specific actions that trigger it, but the effects on the read are among the easiest to think about. If the read were performed without locking the mutex, then it might

  • Observe a value different from the variable's initial value and from any value that was written by any thread, OR

  • Never observe values written by other threads at all, OR

  • Over multiple reads, observe changes to the variable's value in a different order than they occurred, OR

  • In combination with reads of other variables, observe a collection of values of different variables that never existed in memory together at the same time.