OpenSolaris

Bug ID 6575901
Synopsis libc`sharefs() and ld.so have conspired to kill smdiskless
State 10-Fix Delivered (Fix available in build)
Category:Subcategory library:libc
Keywords
Responsible Engineer Thomas Haynes
Reported Against snv_63
Duplicate Of
Introduced In solaris_nevada
Commit to Fix snv_73
Fixed In snv_73
Release Fixed solaris_nevada(snv_73) , solaris_10u6(s10u6_01) (Bug ID:2156402)
Related Bugs 6371468 , 6492577 , 6492581 , 6551814
Submit Date 30-June-2007
Last Update Date 29-April-2008
Description
We recently upgraded our diskless server, only to find out that the
smdiskless(1M) command no longer works:

    # /usr/sadm/bin/smdiskless add -- -i 10.13.22.153 -e 0:3:ba:50:bd:e9
      -n ophel -x os=sparc.sun4u.Solaris_11 -x root=/export/root/ophel
      -x swap=/export/swap/ophel
    Authenticating as user: root

    Type /? for help, pressing <enter> accepts the default denoted by [ ]
    Please enter a string value for: password ::
    Starting Solaris Management Console server version 2.1.0.
    endpoint created: localhost/127.0.0.1:898
    Solaris Management Console server is ready.
    Loading Tool: com.sun.admin.osservermgr.cli.OsServerMgrCli from oversee
    Login to oversee.PRC.Sun.COM as user root was successful.
    Download of com.sun.admin.osservermgr.cli.OsServerMgrCli from oversee
  
->  EXM_RMIERROR
    #

Some research reveals that this obscure error actually corresponds to the
Solaris Management Console crashing.  Indeed, we find diagnostics from the
Java VM at /hs_err_pid24306.log (yes, in / itself; yikes!), which has the
offending stack trace:

    C  [libc.so.1+0x44818]    strlen+0x18
    C  [libsmoss.so+0x17e50]  setup_server_info+0x158
    C  [libsmoss.so+0xe438]   smossdcadd+0x110
    C  [libsmoss.so+0x13cdc]  smossdcadd_jni+0x74
    [ ... ]

Looking at libsmoss.so, we can see that setup_server_info+0x158 doesn't
directly call strlen(), but rather calls a function called sharefs():

    > setup_server_info+0x158::dis -n2
    setup_server_info+0x150: call      +0x28ed8      <PLT:sharefs>
    setup_server_info+0x154: mov       0x1, %o5
    setup_server_info+0x158: tst       %o0
    setup_server_info+0x15c: bne,pn    %icc, +0x14   <setup_server_info+0x170>
    setup_server_info+0x160: add       %fp, -0x4, %o1

Looking at the libsmoss.so source, we find the call to sharefs():
 
    if (clientroot && *clientroot) {
        /*
         * Invoke method for sharing client's root directory:
         * Let DFSTYPE default to "nfs" and we don't want any description
         */
-->         err = sharefs(NULL, workbuffer, NULL, clientroot,
                NULL, 1, real_pathname, &mgmt_cntxt->log);

The sharefs() function itself *also* lives in libsmoss.so:

     int
     sharefs(char *dfstype, char *options,
         char *description, char *pathname,
         char *takeeffect, int mode,
         char *real_pathname,
         SM_log *log)
     {

... and thus one might reasonably expect this function to be called from
the above call site.  However, as part of 6371468, the sharefs() function
was moved from libshare.so.1 to libc.so.1.  Since every application links
with libc.so.1, this means there's now another sharefs() afoot in the
symbol namespace.  Further, because of the (longstanding, but illogical,
surprising and plain downright broken) way ld.so resolves symbols for
dlopen()'d objects, the sharefs() in libc will be used instead of the one
in libsmoss.so.  Since libc`sharefs() has:

    int
    sharefs(enum sharefs_sys_op opcode, struct share *sh)
    {
        uint32_t                i, j;

        /*
         * We need to know the total size of the share
         * and also the largest element size. This is to
         * get enough buffer space to transfer from
         * userland to kernel.
         */
->      i = (sh->sh_path ? strlen(sh->sh_path) : 0);
        sh->sh_size = i;

... and libsmoss.so passes a character buffer (workbuffer) as its second
argument where libc`sharefs() expects a `struct share *', we thus go down
in a ball of flames on the marked line.
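
To make the failure on the marked line concrete, here is a minimal sketch
of what libc`sharefs() ends up reading as sh->sh_path when it is handed a
character buffer.  It assumes a simplified layout in which sh_path is the
first member; struct share_sketch, bogus_sh_path() and the option string
used with it are hypothetical, not taken from libc or libsmoss.so.  The
value is only inspected, never dereferenced:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the real struct share. */
struct share_sketch {
	char	*sh_path;
	size_t	sh_size;
};

/*
 * Return the value that would be read as sh->sh_path if the caller
 * passed a plain character buffer instead of a struct share.
 */
uintptr_t
bogus_sh_path(const char *buf)
{
	struct share_sketch sh;

	/* The buffer must cover at least the field being read. */
	assert(strlen(buf) + 1 >= sizeof (sh.sh_path));

	/* Reinterpret the leading bytes of the string as the first field. */
	(void) memcpy(&sh.sh_path, buf, sizeof (sh.sh_path));
	return ((uintptr_t)sh.sh_path);
}
```

For any ASCII option string the result is a non-NULL value made of
character bytes, so the NULL check passes and strlen() is pointed at an
address that is very unlikely to be mapped.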
Work Around
To work around this specific problem, compile the following as a shared
object with "/opt/SUNWspro/SS11/bin/cc -Kpic -D_REENTRANT -G -o sharefs.so
sharefs.c":

   #include <dlfcn.h>
   #include <stdlib.h>
   #include <sys/types.h>
   #include <unistd.h>
   #include <string.h>
   #include <stdio.h>
   
   static int (*csharefs)();
   static int (*ssharefs)();
   
   #pragma init(sharefs_init)
   
   void
   sharefs_init(void)
   {
        void *ch, *sh;
   
        ch = dlopen("/lib/libc.so.1", RTLD_NOW);
        sh = dlopen("/usr/sadm/lib/wbem/libsmoss.so", RTLD_NOW);
        if (ch == NULL || sh == NULL)
                abort();
   
        csharefs = (int (*)())dlsym(ch, "sharefs");
        ssharefs = (int (*)())dlsym(sh, "sharefs");
        if (csharefs == NULL || ssharefs == NULL)
                abort();
   }
   
   int
   sharefs(void *a, void *b, void *c, void *d, void *e, void *f, void *g,
        void *h)
   {
        if (c == NULL)
                return (ssharefs(a, b, c, d, e, f, g, h));
   
        return (csharefs(a, b));
   }

Then make it a system-wide preload:

    # crle -e LD_PRELOAD=/path/to/sharefs.so

The above workaround is quite crude and fragile; it relies on the fact
that libsmoss always calls sharefs() with a third argument of NULL.
However, since callers of libc`sharefs() pass only two arguments, the
wrapper examines whatever happens to be left in the third argument slot,
and it's possible that this may still end up being NULL.  YMMV.
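
The fragility can be illustrated with a small model of the routing test.
This is only a sketch: the leftover contents of the third argument slot
are modeled as an explicit parameter, and enum target,
route_by_third_arg() and route_demo() are hypothetical names that do not
appear in the workaround itself:

```c
#include <assert.h>
#include <stddef.h>

enum target { TARGET_LIBC, TARGET_SMOSS };

/* Same test as the workaround's sharefs(): route on the third argument. */
enum target
route_by_third_arg(void *a, void *b, void *c)
{
	(void) a;
	(void) b;
	return (c == NULL ? TARGET_SMOSS : TARGET_LIBC);
}

/* Exercise the three cases described above. */
void
route_demo(void)
{
	int opcode = 0, share = 0, junk = 0;

	/* libsmoss caller: third argument is always NULL. */
	assert(route_by_third_arg(NULL, &share, NULL) == TARGET_SMOSS);

	/* libc-style caller, leftover slot happens to be non-NULL. */
	assert(route_by_third_arg(&opcode, &share, &junk) == TARGET_LIBC);

	/* libc-style caller, leftover slot happens to be NULL: misrouted. */
	assert(route_by_third_arg(&opcode, &share, NULL) == TARGET_SMOSS);
}
```

The last case is the one that bites: a two-argument caller is handed to
the libsmoss.so version purely because of what was left behind in the
third slot.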

A slightly more robust workaround figures out which sharefs() is meant
based on the caller's frame.  Still lame, but not dependent on stack
junk.

   #pragma ident	"%Z%%M%	%I%	%E% SMI"
   
   #include <dlfcn.h>
   #include <stdlib.h>
   #include <sys/types.h>
   #include <unistd.h>
   #include <string.h>
   #include <stdio.h>
   #include <ucontext.h>
   
   static int (*csharefs)();
   static int (*ssharefs)();
   
   #pragma init(sharefs_init)
   
   static void
   sharefs_init(void)
   {
   	void *ch, *sh;
   
   	ch = dlopen("/lib/libc.so.1", RTLD_NOW);
   	sh = dlopen("/usr/sadm/lib/wbem/libsmoss.so", RTLD_NOW);
   	if (ch == NULL || sh == NULL)
   		abort();
   
   	csharefs = (int (*)())dlsym(ch, "sharefs");
   	ssharefs = (int (*)())dlsym(sh, "sharefs");
   	if (csharefs == NULL || ssharefs == NULL)
   		abort();
   }
   
   /* ARGSUSED */
   static int
   gather_smoss(uintptr_t pc, int sig, void *arg)
   {
   	Dl_info dli;
   	boolean_t *smossp = arg;
   
   	*smossp = (dladdr((void *)pc, &dli) != 0 &&
   	    strstr(dli.dli_fname, "libsmoss.so") != NULL);
   
   	return (1);
   }
   
   int
   sharefs(void *a, void *b, void *c, void *d, void *e, void *f, void *g,
        void *h)
   {
   	ucontext_t uc;
   	boolean_t smoss = B_FALSE;	/* default if no frame is visited */
   
   	(void) getcontext(&uc);
   	(void) walkcontext(&uc, gather_smoss, &smoss);
   	
   	if (smoss)
   		return (ssharefs(a, b, c, d, e, f, g, h));
   
   	return (csharefs(a, b));
   }
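
The matching step in gather_smoss() can be looked at in isolation.  The
sketch below does not walk the stack; it only shows the substring test
applied to an object path of the kind dladdr() reports in dli_fname.
path_is_smoss() is a hypothetical helper, and the NULL check is a
defensive addition that is not part of the workaround above:

```c
#include <stddef.h>
#include <string.h>

/*
 * Return non-zero if the given object path names libsmoss.so.
 * A NULL path is treated as "not libsmoss".
 */
int
path_is_smoss(const char *fname)
{
	return (fname != NULL && strstr(fname, "libsmoss.so") != NULL);
}
```

Because this is a substring match, any path containing "libsmoss.so"
qualifies, including a versioned name in some other directory.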
Comments
N/A