|
Description
|
Two different users have reported seeing the installation of a Linux zone
from RPMs hang. In both cases, the users were running in a virtualized
environment - one in Parallels and one in VMware. One user reported the
problem happening on UFS only, and the other on both UFS and ZFS. In
both cases, running 'truss' on the installer would cause the problem
to go away.
While both users reported being able to reproduce the hang on demand, we
were unable to reproduce it on our systems at all. One user was good
enough to provide us with a pstack from the hanging rpm process:
    # pstack 1114
    1114: /bin/rpm -ivh --force --aid --nosignature --root /a SysVinit-2.85-4.4.
     d2ef5b85 sigsuspend (8043590)
     d2fd7e57 lx_rt_sigsuspend (8043668, 8, d2787bfc, d2785078, 8043668, 8043650) + 3f
     d2fcecde lx_emulate (804361c) + 1f6
     d2fe07fb ???????? (8043700, d277f8bd, 8043668, 20, 8043668, 2)
     d2780090 __pthread_sigsuspend (8043668, 20, 8043668, 2, 0, 0) + 14
     d277f8bd __pthread_wait_for_restart_signal (d27850a0, 0, 83961bc, d277c1b0, d27850a0, d2851d18) + 51
     d277bf91 pthread_cond_wait (83961bc, 83961a4, 8043758, d277e1e4, d2787bfc, d2748940) + fd
     d2840595 ???????? (8396168, d2748940, 462, d274a0e0, 0, 8396168)
     d28406ad rpmsqWait (8396168, 0, 80437d8, d2840295, 11, 0) + fd
     d2973045 ???????? (8396168, 422, 0, 80438bc, 0, 1)
     d29735e7 ???????? (8396168, 86d3bb8, d2999e93, 1, 8043910, 0)
     d2973d5e ???????? (8396168, 0, 86d3bb8, 8072d28, d296f850, 0)
     d29763b6 rpmpsmStage (8396168, 35, 4, 8043b2c, 1, 0) + 1c66
     d2975d20 rpmpsmStage (8396168, 4, d274ab10, d26926a1, 48, d2787bfc) + 15d0
     d2976197 rpmpsmStage (8396168, 7, d299d0d7, 5d7, 807a310, 92) + 1a47
     d2995741 rpmtsRun (8072d28, 0, 74, 1d7, 0, 0) + 1371
     d2983963 rpmInstall (8072d28, d299edc0, 80698b0, 805b0e0, 0, d2f867e0) + 853
     0804b718 ???????? (1de, 8044034, 80447b0, d274a688, 0, d2f7e250)
     d263ebd1 __libc_start_main (804aba0, 1de, 8044034, 8057654, 805769c, d2f7eafc) + 8d
     0804a931 ???????? ()
This pstack gives the information needed to track down the problem.
Short version: rpm is broken
Long version:
In addition to the bits that are simply unpacked into the file system,
RPMs can include post-install scriptlets that are run after the initial
unpacking. These scriptlets are run by a child process of rpm.
The way this should work is that runScript() forks a child to run the
scriptlet and calls psmWait() to wait for the child process to finish.
When the child finishes, the kernel sends a SIGCHLD to the parent process,
which reaps the exit status and tells psmWait() to return to the caller.
psmWait() calls rpmsqWait(), which in turn calls
rpmsqWaitUnregister(). rpmsqWaitUnregister() blocks SIGCHLD, unblocks
SIGCHLD, and then waits for the signal handler to signal a condition variable.
The Right Way to use a cv is something like:
    while (val == foo)
        pthread_cond_wait(&val_cond, &val_mutex);
This cleanly handles any race condition in which the other thread runs
first, since val serves as a flag indicating that the condition has been
satisfied. It also handles spurious wakeups, which POSIX permits even
with a non-buggy thread library.
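To make the pattern concrete, here is a minimal, self-contained sketch of a predicate-guarded wait. It is not rpm code: struct waiter, waiter_notify(), waiter_wait() and notify_before_wait_demo() are hypothetical names made up for illustration. The demo deliberately forces the "other thread runs first" ordering by joining the notifier before waiting.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical names throughout; none of this is taken from rpm. */
struct waiter {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             done;   /* predicate: set once the event has happened */
};

/* Record the event under the mutex, then wake any waiter. */
static void waiter_notify(struct waiter *w)
{
    pthread_mutex_lock(&w->mutex);
    w->done = 1;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->mutex);
}

/* Sleep only while the event has not happened yet. */
static void waiter_wait(struct waiter *w)
{
    pthread_mutex_lock(&w->mutex);
    while (!w->done)
        pthread_cond_wait(&w->cond, &w->mutex);
    pthread_mutex_unlock(&w->mutex);
}

static void *notifier_thread(void *arg)
{
    waiter_notify(arg);
    return NULL;
}

/* Force the notifier to run to completion before we wait.
 * Returns 1 if the wait completed, 0 if setup failed. */
static int notify_before_wait_demo(void)
{
    struct waiter w;
    pthread_t t;
    int ok;

    pthread_mutex_init(&w.mutex, NULL);
    pthread_cond_init(&w.cond, NULL);
    w.done = 0;

    if (pthread_create(&t, NULL, notifier_thread, &w) != 0)
        return 0;
    pthread_join(t, NULL);  /* the signal has already been sent */
    waiter_wait(&w);        /* does not sleep: done is already 1 */
    ok = w.done;

    pthread_cond_destroy(&w.cond);
    pthread_mutex_destroy(&w.mutex);
    return ok;
}
```

Because the flag is checked under the mutex before sleeping, the order in which the two threads run does not matter.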
rpm doesn't do this. Instead it just does:
    ret = sighold(SIGCHLD);
    [...]
    xx = sigrelse(SIGCHLD);
    ret = pthread_cond_wait(&sq->cond, &sq->mutex);
    xx = sighold(SIGCHLD);
The assumption seems to be that the sighold() will prevent the SIGCHLD from
being handled until rpm is ready to handle it.
But...what happens if the child finishes before the initial sighold()? Or
for that matter, what if the SIGCHLD arrives between the calls to
sigrelse() and pthread_cond_wait()? It seems like the signal handler would
run, signal the cv, and then exit. This thread would then block the
signal, unblock the signal, and wait forever on a cv that will not be
signalled again.
This would give us exactly the same stack provided by Tony.
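The lost wakeup can be shown in isolation. The following is a rough sketch, not rpm code: lost_wakeup_demo() is a hypothetical name, and pthread_cond_timedwait() with a one-second deadline stands in for the unbounded pthread_cond_wait() so that the demo returns instead of hanging. The signal is sent before anyone waits, which mimics the handler running first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Signal a condition variable with no waiter present, then wait on it
 * without a predicate. Returns whatever the timed wait returns. */
static int lost_wakeup_demo(void)
{
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    struct timespec deadline;
    int ret;

    pthread_mutex_init(&mutex, NULL);
    pthread_cond_init(&cond, NULL);

    /* The "handler" fires first. Nobody is waiting, so the signal has
     * no effect and is not remembered. */
    pthread_cond_signal(&cond);

    /* One second stands in for "forever". */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += 1;

    /* A bare wait with no flag to check: the earlier signal is lost. */
    pthread_mutex_lock(&mutex);
    ret = pthread_cond_timedwait(&cond, &mutex, &deadline);
    pthread_mutex_unlock(&mutex);

    pthread_cond_destroy(&cond);
    pthread_mutex_destroy(&mutex);
    return ret;
}
```

Barring a spurious wakeup, the wait should end with ETIMEDOUT rather than 0. With an untimed pthread_cond_wait() the thread would simply stay asleep, which is what the pstack shows.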
A comment in rpmsqWaitUnregister() suggests that there is some higher level
logic that ensures the child won't even run until after the sighold() call.
I don't actually believe this theoretical logic exists. Even if it does
exist, it doesn't remove the race window between the sigrelse() and
pthread_cond_wait() calls.
I think the rpm code should actually be:
    xx = sigrelse(SIGCHLD);
    while (sq->reaped == 0)
        ret = pthread_cond_wait(&sq->cond, &sq->mutex);
    xx = sighold(SIGCHLD);
Actually, unless I'm missing something else, the sigrelse()/sighold() calls
can go away. They appear to be just a useless, broken attempt to prevent
this race condition. (Of course, one wonders: if the author was aware of
the race condition, why not do a better job of preventing it?)
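For comparison, here is a sketch of how a flag closes the window in a plain single-threaded program that waits for SIGCHLD without a condition variable. This is an analogue for illustration only, not a patch for rpm: run_child_and_wait(), on_sigchld() and the file-scope reaped flag are hypothetical names, and the sketch assumes no other thread is manipulating the signal mask. In this signal-only variant the blocking is doing real work, because sigsuspend() unblocks SIGCHLD and sleeps as one atomic step; the flag plays the same role as sq->reaped above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for sq->reaped; set only by the handler. */
static volatile sig_atomic_t reaped;

static void on_sigchld(int signo)
{
    (void)signo;
    reaped = 1;
}

/* Fork a child that exits at once with `code`, wait for it, and return
 * the exit status the parent observed (or -1 on failure). */
static int run_child_and_wait(int code)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    pid_t pid;
    int status = 0, result = -1;

    reaped = 0;
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGCHLD, &sa, &old_sa) != 0)
        return -1;

    /* Block SIGCHLD before fork() so it stays pending until we wait. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGCHLD);

    pid = fork();
    if (pid == 0)
        _exit(code);        /* child: finish immediately */

    if (pid > 0) {
        /* Check the flag before every sleep, so a child that has
         * already exited cannot leave us waiting forever. */
        while (!reaped)
            sigsuspend(&wait_mask);
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }

    /* Restore the caller's signal mask and SIGCHLD disposition. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return result;
}
```

The child here exits as fast as it can, which is the ordering that trips up the unguarded wait; the loop on the flag returns regardless of who gets there first.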
|