OpenSolaris

Bug ID 6679853
Synopsis fix for 6647542 causes machines with separate root and /usr to fail boot
State 10-Fix Delivered (Fix available in build)
Category:Subcategory library:libc
Keywords no-s10
Responsible Engineer Roger Faulkner
Reported Against
Duplicate Of
Introduced In solaris_nevada
Commit to Fix snv_87
Fixed In snv_87
Release Fixed solaris_nevada(snv_87)
Related Bugs 6647542
Submit Date 25-March-2008
Last Update Date 9-April-2008
Description
The putback for this CR:

6647542 POSIX scheduling should be compatible with Solaris scheduling classes

introduced a severe failure mode for machines with separate
root (/) and /usr disk partitions.

The problem is caused by libc's initialization section.  It interrogates
the real-time class interface to determine the real-time class id,
in this line of code:

      self->ul_rtclassid = get_info_by_policy(SCHED_FIFO)->pcc_info.pc_cid;

Juergen Keil figured it out.  He said:

    Root cause is that libc_init needs to be able to load the sched/RT 
    kernel module, but the RT module lives in the /usr filesystem.  When
    init(1M) starts, /usr isn't mounted yet (if your system is setup
    to use a separate /usr filesystem).  For this reason
    get_info_by_policy(SCHED_FIFO) returns NULL, and we're crashing in
    libc_init.    8-/
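
To illustrate the failure mode described above, here is a minimal sketch
of a NULL-safe version of that lookup.  It is not the suggested fix and
not libc source: every demo_/DEMO_ name, the rt_module_loadable flag, and
the class id 3 are hypothetical stand-ins.  The only assumption taken from
this report is that get_info_by_policy() returns NULL when the RT module
cannot be loaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for libc's internal structures (not the real ones). */
struct demo_pcinfo  { int pc_cid; };
struct demo_pcclass { struct demo_pcinfo pcc_info; };

#define DEMO_SCHED_FIFO   1     /* placeholder policy constant */
#define DEMO_CID_UNKNOWN  (-1)  /* sentinel: class id not yet known */

/* 0 models "/usr is not mounted yet, so the RT module cannot be loaded". */
static int rt_module_loadable = 0;

static struct demo_pcclass demo_rt_class = { { 3 } };  /* arbitrary demo id */

/* Models get_info_by_policy(): returns NULL when the class is unavailable. */
static const struct demo_pcclass *
demo_get_info_by_policy(int policy)
{
    if (policy != DEMO_SCHED_FIFO || !rt_module_loadable)
        return NULL;
    return &demo_rt_class;
}

/*
 * Guarded lookup: checks the result before dereferencing it, so an
 * unavailable RT class yields a sentinel instead of a SIGSEGV.
 */
static int
demo_rtclassid(void)
{
    const struct demo_pcclass *pccp = demo_get_info_by_policy(DEMO_SCHED_FIFO);

    return (pccp == NULL) ? DEMO_CID_UNKNOWN : pccp->pcc_info.pc_cid;
}
```

A caller that gets DEMO_CID_UNKNOWN could retry the lookup later, once
/usr is mounted.  Whether the real fix defers the lookup this way is not
stated in this report.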

The net result is that init(1M) dies with a segmentation fault (signal 11)
at boot, and the kernel keeps restarting it to no avail.  The console shows
a string of these messages:

    WARNING: init(1M) exited on fatal signal 11: restarting automatically
    WARNING: init(1M) exited on fatal signal 11: restarting automatically
    WARNING: init(1M) exited on fatal signal 11: restarting automatically
    WARNING: init(1M) exited on fatal signal 11: restarting automatically
    ...

See the suggested fix.
Work Around
N/A
Comments
N/A