I strace'd a correctly working program to compare it with a faulty one. In the correct one, I have two threads (using pthreads), with one waiting for the other via a regular pthread_join. Under the hood I can see that the waiting (primary) thread waits on a futex (looking at the sources and verifying with GDB, the address matches &pd->tid). The part I don't understand is that there's no matching FUTEX_WAKE on this address when the second thread exits:
2579217 futex(0x7fe28bfff910, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 2579218, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
...
2579218 madvise(0x7fe28bf00000, 1024000, MADV_DONTNEED) = 0
2579218 exit(0) = ?
2579217 <... futex resumed>) = 0
2579218 +++ exited with 0 +++
2579217 exit_group(0)
I suspected the robust futex ABI of leaving the wakeup to the kernel, but the robust list seems to be empty. I think it would make sense to wake up (or segfault?) waiting threads if the memory containing the futex word gets unmapped, but this does not seem to be the case. What makes the kernel wake up this primary thread?
Thanks to the hints in the comments I was able to figure out what's happening, so just to explain things a bit further: the wakeup is documented for the clone function under the CLONE_CHILD_CLEARTID flag. With that flag, the kernel clears the child's TID word when the thread exits and performs a futex wake on that address. The address can be set or changed later during the life of the thread using the set_tid_address syscall.
One more thing that helped me debug is trace from bcc tools - I was able to debug kernel through