|
Description
|
The fix for:
6296770 process robust mutexes should be much faster
did what it says; it made process-shared robust mutexes faster.
Unfortunately, it also introduced a race condition problem wherein
the lock could be acquired (the lock byte set) without the ownerpid
field being set, as in this snippet of code from libc's synch.c :
if (*lockp == 0 && set_lock_byte(lockp) == 0) {
mp->mutex_owner = (uintptr_t)self;
mp->mutex_ownerpid = udp->pid;
error = 0;
break;
}
set_lock_byte(), when it returns zero, has atomically set the lock byte
in the mutex. For the process-shared lock to be properly held, the
mutex_ownerpid must also be set to the process's pid. This does not
happen atomically, but rather two lines farther down.
If the process dies due to a signal from outside or due to some other
thread incurring a SIGSEGV before this thread reaches the assignment
of mutex_ownerpid, the robust lock will be left locked, but with no
owner. The kernel code that marks the mutex EOWNERDEAD will not be
triggered when the process dies because the mutex_ownerpid field
does not match the dying process's pid. Other processes attempting
to acquire the lock will block forever, even though they should
acquire the lock with EOWNERDEAD.
See the attached file, robust_test.c .
Compile and then run this test case several times on
a multiprocessor and it will eventually report:
ERROR: process appears to be wedged, pid = 4761
mutex lockword = 0xff000001 ownerpid = 0x00000000
showing that the lock byte is set, but the ownnerpid field is not.
To fix this problem without resorting to always calling the kernel
on every acquisition and release of a process-shared robust lock,
we need to arrange for both the lock byte and the ownerpid field
to be set in one atomic operation.
Happily, this is possible because both of these fields are contained
in one 64-bit word of the mutex and because all machines supporting
Solaris support 64-bit atomic operations. We can use atomic_cas_64()
for both setting and releasing the process-shared lock.
There is one more race condition, in the mutex_unlock() code
of mutex_unlock_process():
mp->mutex_owner = 0;
mp->mutex_ownerpid = 0;
DTRACE_PROBE2(plockstat, mutex__release, mp, 0);
old_lockword = clear_lockbyte(&mp->mutex_lockword);
if ((old_lockword & WAITERMASK) &&
(release_all || (old_lockword & SPINNERMASK) == 0)) {
ulwp_t *self = curthread;
no_preempt(self); /* ensure a prompt wakeup */
(void) ___lwp_mutex_wakeup(mp, release_all);
preempt(self);
}
We can fix the clear_lockbyte() portion of this code to use atomic_cas_64()
to clear both the lock byte and the mutex_ownerpid field atomically, but the
call to ___lwp_mutex_wakeup() occurs later. If the process dies after
clearing the lock but before calling ___lwp_mutex_wakeup(), we will have
missed a wakeup and a thread waiting for the mutex could sleep forever
even though the lock is clear.
This can be fixed in the kernel code that deals with process-shared
robust locks when a process dies. In that code, if the kernel finds
an unowned robust lock with the waiter bit set, it can just wake up
a waiter (or all waiters if the lock is marked NOTRECOVERABLE).
If such a wakeup was not necessary, no harm is done because the
affected thread will just loop around and try the lock and
go back to sleep again, if necessary.
Finally, there is a problem with the kernel code for dealing with
a locked process-shared robust mutex when the mapping containing
the lock is unmapped. This was uncovered while running the tests
for the fix to:
6545064 Posix message queue should fail reliably when a client process crashes
The problem is that the same process-shared lock can be mapped more than
once into a single process (as the SUSv3 POSIX conformance test suite does
when testing message queues). When one of the mappings is unmapped, it
may happen that the lock is currently held via one of the other mappings.
This is not an error, but the kernel code cannot know this. The fix
for this problem is simply to declare it as a non-problem and do not
cause the lock to become LOCK_UNMAPPED or LOCK_OWNERDEAD just because
it is being unmapped by munmap(). The original USYNC_PROCESS_ROBUST
implementation assumed single-threaded processes sharing a lock and
was not MT-safe.
See the suggested fix.
|