libc mustn't use defread internally
[dep, 09Dec2008]
When (manually, grr) running savecore on my OpenSolaris box today I
witnessed the following bizarre behavior:
# savecore
System dump time: Thu Dec 4 14:35:35 2008
Constructing namelist n double quotes (") or single quotes (')./unix.0
Constructing corefile n double quotes (") or single quotes (')./vmcore.0
...yet the crash dump was correctly saved in the proper directory.
After poking around, I found the intruding text in /etc/default/init.
savecore obtains the name of the crash dump directory using
defopen/defread to read /etc/dumpadm.conf. But it makes no other use
of these functions. It turns out, however, that *libc* is using
these routines when reading the default time zone from
/etc/default/init. When printing the dump time, savecore unknowingly
recycled defread's internal buffer, invalidating the pointer returned
to it earlier. Fortunately, savecore had used that pointer for
functional purposes only before printing the time, which is why the
dump still landed in the proper directory.
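The hazard can be reproduced without libc's help. The following is a
minimal sketch, not savecore or libc code: fake_defread() is a
hypothetical stand-in that mimics only the one property that matters
here (a single static buffer reused by every call), and the directory
name and second string are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for defread(): every call overwrites the same
 * static buffer and returns a pointer into it.
 */
char *
fake_defread(const char *value)
{
    static char buf[256];

    (void) snprintf(buf, sizeof (buf), "%s", value);
    return (buf);
}

/* Copy a returned value into caller-owned memory; the caller frees it. */
char *
copy_value(const char *value)
{
    size_t len = strlen(value) + 1;
    char *copy = malloc(len);

    if (copy != NULL)
        (void) memcpy(copy, value, len);
    return (copy);
}

/*
 * Returns 1 if a later lookup changed the text behind the first
 * pointer, 0 if not, -1 on allocation failure.
 */
int
first_pointer_clobbered(void)
{
    char *dir = fake_defread("/var/crash/example");  /* assumed path */
    char *saved = copy_value(dir);
    int clobbered;

    if (saved == NULL)
        return (-1);

    /* Stands in for the lookup libc performs behind the caller's back. */
    (void) fake_defread("text from some other file");

    clobbered = (strcmp(dir, saved) != 0);
    free(saved);
    return (clobbered);
}
```

A caller that copies the value immediately, as copy_value() does, is
immune to later lookups; a caller that keeps the raw pointer is not.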
This is seldom seen because TZ is usually set in the environment,
obviating the need to look up the default time zone. Apparently on
OpenSolaris TZ *isn't* set in the environment, forcing all
applications down this code path.
This problem was introduced when 6480998 replaced localtime.c's
hand-coded /etc/default/init parser with a call to defread. While
the code is now much simpler, it potentially breaks every caller of
defread that also uses the time routines in libc.
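For comparison, a lookup that shares no state with its callers might
look like the sketch below. This is neither the original localtime.c
parser nor a proposed fix; tz_from_stream() is a hypothetical name, and
the sketch assumes a plain "TZ=value" line at the start of a line,
with no quoting or leading whitespace handled.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan a stream for a line beginning with "TZ=" and copy its value
 * into the caller's buffer. All scratch space is on the stack, so no
 * pointer handed out elsewhere can be invalidated.
 * Returns 0 on success, -1 if TZ is absent or the value does not fit.
 */
int
tz_from_stream(FILE *fp, char *out, size_t outlen)
{
    char line[512];

    while (fgets(line, sizeof (line), fp) != NULL) {
        if (strncmp(line, "TZ=", 3) != 0)
            continue;

        /* Drop the trailing newline, if any. */
        line[strcspn(line, "\n")] = '\0';

        if (strlen(line + 3) >= outlen)
            return (-1);
        (void) strcpy(out, line + 3);
        return (0);
    }
    return (-1);
}
```

The caller supplies the output buffer, so the result stays valid for
as long as the caller keeps that buffer around.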
Work Around
TZ is set by init and login, so in the Solaris environment TZ is
available in most cases. In the OpenSolaris environment, however,
gdm does not pass the TZ setting to its children, so a shell started
from the GUI login has no TZ setting. A workaround for this problem
is therefore to set TZ before invoking savecore.
The savecore process that may be invoked by the /system/dumpadm service
will not hit this problem because it inherits its TZ setting from
init.