|
Description
|
Due to a change in the implementation of userland mutexes introduced
by CR 6296770 in snv_69 and KU 137111-01, objects of type mutex_t and
pthread_mutex_t must start at 8-byte aligned addresses. If this
requirement is not satisfied, an application on Solaris/SPARC will
fail with a SEGV with a call stack similar to the following:
[1] *_atomic_cas_64(0x141f2c, 0x0, 0xff000000, 0x1651, 0xff000000, 0x466d90)
[2] set_lock_byte64(0x0, 0x1651, 0xff000000, 0x0, 0xfec82a00, 0x0)
[3] fast_process_lock(0x141f24, 0x0, 0x1, 0x1, 0x0, 0xfeae5780)
...
or similar call stacks containing the function mutex_trylock_process.
The first argument to fast_process_lock above is the mutex,
and it is only 4-byte aligned, not 8-byte aligned. Failures are also
likely on Solaris/x86. The failure mode would not be a SEGV,
because x86 allows unaligned accesses, but unaligned accesses are
not guaranteed to be atomic, so the mutex may become inconsistent,
causing unpredictable failures. The failures occur only when
using inter-process mutexes (e.g., type USYNC_PROCESS or
PTHREAD_PROCESS_SHARED).
Note that the mutex_t and pthread_mutex_t structures have long been
defined to require 8-byte alignment. They contain a upad64_t member,
which contains a double, even for 32-bit applications. The natural
alignment of a double is 8 bytes, and structures must be aligned
according to their strictest member, per the SPARC Compliance
Definition 2.4. Applications which create merely 4-byte aligned
mutexes are technically non-compliant. However, previously the
mutex implementation only performed 4-byte accesses, so unaligned
mutexes worked. The new implementation performs 8-byte atomic
operations, which fail for unaligned mutexes, so non-compliant
applications now fail.
Compilers recognize the 8-byte requirement that is implicit
in the definition of mutex_t, so statically defined variables
and stack variables are aligned properly. malloc and related
allocators return 8-byte aligned addresses, so structures that
are dynamically allocated using these routines are also safe.
However, if an application performs address arithmetic before
assigning addresses to mutexes or to structures containing mutexes,
or if an application implements its own allocation routine,
then such applications will fail if they only guarantee
4-byte alignment in their assigned addresses.
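The address arithmetic case can be illustrated with a minimal C sketch.
The helper names is_aligned8, align_up8 and place_object are
hypothetical and are not part of any Solaris interface. They only show
one way an application that carves objects out of a larger region might
round each address up to an 8-byte boundary before using it as a mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p sits on an 8-byte boundary. */
static int is_aligned8(const void *p)
{
    return ((uintptr_t)p & 7u) == 0;
}

/* Hypothetical helper: round p up to the next 8-byte boundary.
 * An address that is already aligned is returned unchanged. */
static void *align_up8(void *p)
{
    uintptr_t a = ((uintptr_t)p + 7u) & ~(uintptr_t)7u;
    return (void *)a;
}

/* Hypothetical helper: pick an 8-byte aligned address for an object of
 * obj_size bytes at or after region + offset. Returns NULL if the
 * object would not fit inside the region after alignment. */
static void *place_object(void *region, size_t region_size,
                          size_t offset, size_t obj_size)
{
    char *base = region;
    char *p;
    size_t used;

    if (offset > region_size)
        return NULL;
    p = align_up8(base + offset);
    used = (size_t)(p - base);
    if (used > region_size || obj_size > region_size - used)
        return NULL;
    assert(is_aligned8(p));
    return p;
}
```

An application would then pass the returned address, rather than
region + offset, to mutex_init or pthread_mutex_init.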
|
|
Work Around
|
If the problem is an application-specific general-purpose allocation
function, then interpose a new allocation function on top of the
deficient allocator using LD_PRELOAD. The new routine should increase
the requested size by 4 bytes, call the original routine, test the
returned pointer, and increment it by 4 if it is only 4-byte
aligned.
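The wrapper logic described above might look roughly like the following
C sketch. All names here are hypothetical: alloc_aligned8 is the
wrapper, alloc_fn stands for the application's original allocation
routine, and demo_alloc4 is only a stand-in for a deficient allocator
so that the sketch is self-contained. In a real interposing library the
original routine would more likely be located with
dlsym(RTLD_NEXT, ...). The sketch assumes the original allocator
guarantees at least 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for the application's original allocator. */
typedef void *(*alloc_fn)(size_t);

/* Wrapper: ask for 4 extra bytes, then skip 4 bytes if the returned
 * address is only 4-byte aligned. */
static void *alloc_aligned8(alloc_fn original_alloc, size_t size)
{
    char *p;

    if (size > SIZE_MAX - 4)          /* size + 4 would overflow */
        return NULL;
    p = original_alloc(size + 4);
    if (p == NULL)
        return NULL;
    assert(((uintptr_t)p & 3u) == 0); /* assumed 4-byte alignment */
    if (((uintptr_t)p & 7u) == 4)
        p += 4;                       /* now 8-byte aligned */
    return p;
}

/* Stand-in for a deficient allocator (illustration only): a bump
 * allocator whose blocks always start 4 bytes past an 8-byte
 * boundary. */
static uint64_t demo_arena[32];
static size_t demo_used = 4;

static void *demo_alloc4(size_t size)
{
    size_t rounded;
    void *p;

    if (size > sizeof demo_arena)
        return NULL;
    rounded = (size + 7u) & ~(size_t)7u;
    if (rounded > sizeof demo_arena - demo_used)
        return NULL;
    p = (char *)demo_arena + demo_used;
    demo_used += rounded;
    return p;
}
```

Note that the adjusted pointer is no longer the address the original
allocator returned, so a matching release routine would need to recover
the original address before freeing. That part is not covered by this
sketch.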
If the problem is address arithmetic, there is no easy workaround,
aside from fixing the application to guarantee 8-byte alignment and
become compliant.
|