Do I need to use volatile?


Consider the following code:

public class MyDataStructure {

    int size;
    final ReentrantLock lock = new ReentrantLock();

    public void update() {
        lock.lock();
        try {
            // do cool stuff
            size++;
        } finally {
            lock.unlock();
        }
    }

    public int size() {
        return size;
    }
}

As far as I understand, ReentrantLock imposes a happens-before relationship, so size++ has to update main memory and the caches of other threads with the new value. Hence there is no need to declare size as volatile. For the same reason, the size() method can simply return size.

Is my understanding correct?


8
Mattias Isegran Bergander On BEST ANSWER

No.

Other threads may see a stale (old) value of size.

When another thread executes size(), nothing instructs it to make sure it is not reading an old value, such as a value cached locally in another CPU's cache. Hoisting by the JVM is also worth mentioning.

For example, a loop in one thread that calls size() may never exit if update() is not called inside the loop (and size is not changed there directly) but is only called or changed from other threads:

while (size() == 0) {
   ...
}

The JIT compiler (at least the modern part, formerly called the C2/server compiler) could happily hoist the check of the size variable out of the loop, because it sees that the variable never changes inside the loop.
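To make the hoisting concrete, here is a rough sketch of the kind of transformation the Java memory model permits for a plain (non-volatile) field. The class and method names are hypothetical, and the second method only illustrates the effect; it is not actual JIT output.

```java
// Illustrative only: all names here are hypothetical.
class HoistingSketch {
    int size; // plain field: not volatile, no lock on the read path

    // The loop as written in the source code.
    void waitAsWritten() {
        while (size == 0) {
            // spin
        }
    }

    // What the compiled loop may effectively become: the field is read
    // once, and the loop never looks at memory again.
    void waitAsIfHoisted() {
        if (size == 0) {
            while (true) {
                // spins forever, even if another thread later changes size
            }
        }
    }
}
```

In a single thread the two methods behave identically, which is why the compiler is allowed to treat them as equivalent.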

Update about alternatives:

volatile could help if only one thread ever writes to the variable size, including through update(). Otherwise the increment is not protected: size++ first reads and then writes the variable, so two threads could both read the same fresh value "at the same time" and both add 1, giving a total of +1 instead of +2. Even with a single writer I would argue against it, because that could change in the future and it is a fairly subtle thing to have in the code, so a future developer (yourself included) stands a great risk of missing it.

So one option is to take the lock in the size() method as well. That gives you the extra features of locks, and you could even use a read-write lock, which allows many readers but only one writer. It is not as readable (or rather, it is more verbose) as the other alternatives. It could also be nice with the future virtual threads if this were a costly operation (it isn't in this case, though).

Another option is the traditional, simple way: add synchronized to both methods, or use a synchronized (this) { ... } block. That provides, beyond doubt, guarantees about both exclusion and memory visibility. The only real drawback is that it uses the object instance itself as the mutex/monitor, which might not be granular enough if other, unrelated variables need protection as well, or if other code holding a reference to this object also synchronizes on it. You could then add a dedicated monitor/mutex object for each variable. The pattern is one field each:
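A minimal sketch of the read-write lock variant, assuming the standard ReentrantReadWriteLock from java.util.concurrent.locks. The class names and the demo helper are hypothetical and exist only to exercise the idea.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical variant of MyDataStructure guarded by a read-write lock.
class ReadWriteLockedStructure {
    private int size;
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void update() {
        lock.writeLock().lock(); // exclusive: one writer at a time
        try {
            size++;
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock(); // shared: many readers may hold it together
        try {
            return size;
        } finally {
            lock.readLock().unlock();
        }
    }
}

class ReadWriteLockDemo {
    // Runs concurrent writers and returns the size seen after they all finish.
    static int countAfterConcurrentUpdates(int threads, int updatesPerThread) {
        ReadWriteLockedStructure data = new ReadWriteLockedStructure();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < updatesPerThread; j++) {
                    data.update();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return data.size();
    }
}
```

For a read as cheap as returning an int, the read-write lock is unlikely to pay for itself; it is shown here only because the text mentions it.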

private final Object sizeMutex = new Object();

... and synchronize on those with synchronized (sizeMutex) { ... } where needed. That gets more and more verbose and complex, but it is still fairly obvious and understandable. The biggest risk is the possibility of introducing a deadlock. You most likely don't need to be that granular, but it is good to think about.

The simplest option in this particular case, though, is to use AtomicInteger (or one of the related classes for other primitives, or even for entire objects), as others recommend in the comments.
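The dedicated-mutex pattern could look roughly like this. The second field (hits) and its monitor are assumptions added purely to show two independently guarded variables; they are not part of the original class.

```java
// Hypothetical class with one monitor object per guarded field.
class MutexGuardedStructure {
    private final Object sizeMutex = new Object();
    private final Object hitsMutex = new Object(); // assumed second monitor

    private int size;
    private long hits; // assumed unrelated field, for illustration only

    void update() {
        synchronized (sizeMutex) {
            size++;
        }
    }

    int size() {
        synchronized (sizeMutex) {
            return size;
        }
    }

    void recordHit() {
        synchronized (hitsMutex) { // does not contend with size readers/writers
            hits++;
        }
    }

    long hits() {
        synchronized (hitsMutex) {
            return hits;
        }
    }
}

class MutexDemo {
    // Returns {size, hits} after all worker threads have finished.
    static long[] countsAfterConcurrentUse(int threads, int perThread) {
        MutexGuardedStructure data = new MutexGuardedStructure();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    data.update();
                    data.recordHit();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { data.size(), data.hits() };
    }
}
```

Because no method ever holds both monitors at once, this particular sketch cannot deadlock; the risk appears as soon as some code path needs to nest them.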

private final AtomicInteger size = new AtomicInteger();

public int size() {
   return size.get();
}

public void update() {
   size.incrementAndGet();
}
0
Solomon Slow On

As far as I understand, ReentrantLock imposes a happens-before relationship...

Yes. When some thread W (W for "writer") unlocks a lock, that "happens before" some other thread R (for "reader") subsequently locks the same lock.

In order to take advantage of that relationship, you have to lock the same lock in both methods:

public class MyDataStructure {

    int size;
    final ReentrantLock lock = new ReentrantLock();

    public void update() {
        lock.lock();
        try {
            size++;         // happens before unlock() because things that happen
                            //  in one thread all happen in _program order._
        } finally {
            lock.unlock();  // Happens before a pending lock.lock() call in some
                            //  other thread.
        }
    }

    public int size() {
        lock.lock();        // Happens before return size because...
                            //  ...program order.
        try {
            return size;
        } finally {
            lock.unlock();
        }
    }
}
0
leonov On

The accepted answer is correct, but the Update about alternatives section is wrong. Maybe it was added after the answer was accepted.

In the scenario described by the OP, the semantics of volatile are the same as if the size() method were using the same lock as the update() method, or as if both methods were synchronized.
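A minimal sketch of that variant, assuming writes stay under the lock exactly as in the question and only the field declaration changes. The class names and the demo helper are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical variant of the OP's class: same locked update(), volatile field.
class VolatileSizeStructure {
    private volatile int size;
    private final ReentrantLock lock = new ReentrantLock();

    void update() {
        lock.lock();
        try {
            size++; // read-modify-write; safe only because the lock serializes writers
        } finally {
            lock.unlock();
        }
    }

    int size() {
        return size; // volatile read: always observes the latest completed write
    }
}

class VolatileSizeDemo {
    // Returns true if a spinning reader observed the update within the timeout.
    static boolean readerSeesUpdate(long timeoutMillis) {
        VolatileSizeStructure data = new VolatileSizeStructure();
        Thread reader = new Thread(() -> {
            while (data.size() == 0) {
                // spin until the writer's update becomes visible
            }
        });
        reader.setDaemon(true); // do not keep the JVM alive if it never exits
        reader.start();
        data.update();
        try {
            reader.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !reader.isAlive();
    }
}
```

The lock still does the work of making size++ atomic across writers; volatile only adds visibility for the lock-free read.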

  1. Regardless of anything else, there cannot be a race condition between calls to update(), no matter how many threads call it. This is the whole point of having a lock. So it is not possible for two threads to call update() and have the value incremented by only 1. The lock establishes a happens-before relationship for every field written while it is held.

  2. What volatile means in plain language is every thread will always see the most recent value of size. Which, in this case, is exactly the same as locking in the size() method or having both methods use synchronized.

  3. There seems to be confusion about what a compound operation is. It is absolutely true that multiple threads calling size() and update() may see the same value multiple times. This is because size() and update() are independent operations and there is no guarantee as to the order in which they are called.

    Any algorithm which depends on both of these methods for correctness (e.g. if(size() < 3) update()) is using them as a compound operation. Which means the code must be synchronized externally.
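As a rough illustration of external synchronization for the compound operation mentioned in point 3, the caller can hold the same ReentrantLock around both calls; since the lock is reentrant, the nested lock() calls inside size() and update() succeed. The sketch assumes the lock field is accessible to the caller (it is package-visible in the question's code) and that size() also takes the lock. The names LockedStructure, CompoundOperationDemo and updateIfBelow are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical version of the OP's class with both methods locked.
class LockedStructure {
    private int size;
    final ReentrantLock lock = new ReentrantLock(); // visible to callers in the same package

    void update() {
        lock.lock();
        try {
            size++;
        } finally {
            lock.unlock();
        }
    }

    int size() {
        lock.lock();
        try {
            return size;
        } finally {
            lock.unlock();
        }
    }
}

class CompoundOperationDemo {
    // Check-then-act performed atomically by holding the lock around both calls.
    static void updateIfBelow(LockedStructure data, int limit) {
        data.lock.lock();
        try {
            if (data.size() < limit) {
                data.update();
            }
        } finally {
            data.lock.unlock();
        }
    }

    // Many threads race on the compound operation; returns the final size.
    static int sizeAfterRacingCallers(int threads, int limit) {
        LockedStructure data = new LockedStructure();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 100; j++) {
                    updateIfBelow(data, limit);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return data.size();
    }
}
```

Without the outer lock, two threads could both see size() return 2 and both call update(), leaving the size at 4 even though each individual method is thread-safe.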