roger.faulkner@Eng 2001-06-17
Single-threading and multi-threading
------------------------------------
We (Sun) have been of two minds concerning user-level multithreading
for ten years. On the one hand we say that it is good, that it
enables applications to use the full power of our multiprocessor
machines, that it promotes application and system scalability.
On the other hand, we view multithreaded programs and libraries
with fear and loathing, believing them to be riddled with bugs,
causing us untold hours of debugging pain, and actually causing
performance problems rather than solving them.
So to satisfy our two minds on the subject, we have caused Solaris
to evolve into a system with two process models: a single-threaded
process model (not linked with libthread) and a multi-threaded process
model (linked with libthread).
This leads to fundamental problems:
- Libraries like libnsl that want to create helper threads have
complex code to do one thing if the application they are linked
with is single-threaded and another if it is multi-threaded.
- Some libraries use multi-threading and satisfy their need for
libthread by linking with it when the shared object is built.
Such libraries can still be dlopen()d by an application and
when a single-threaded application does so, it suddenly becomes
multi-threaded, a condition for which it was not built.
- libc has become quite complex over time. It has to operate
in either process model and be prepared to switch to the
multi-threaded model at the moment the application dlopen()s
libthread. The complexity of this logic itself leads to
performance problems.
- New developments like thread local storage (TLS), requiring
cooperation between the compilers and the Solaris libraries,
can only be accomplished in a multi-threaded process model.
This means that the Solaris libraries themselves cannot take
advantage of TLS, since they must be prepared to operate with
both single-threaded and multi-threaded applications.
- Customers forget to use the -mt flag for their compiles
and waste their time and ours debugging non-problems.
The list goes on.
There is historical justification for the fear and loathing that has been
applied to the multi-threaded process model, because before Solaris 9 it
did not match the semantics of the single-threaded model: Signals were
handled poorly and certain interfaces were restricted. Daemon threads
and LWPs were required by the old libthread; multi-threaded applications
did not reduce gracefully to the single-threaded model with only one LWP
when they did not use any thread other than the main thread. There were
bugs that could not be fixed due to the design of the old libthread.
The new libthread in Solaris 9 does not suffer from the problems listed
in the previous paragraph. Multi-threaded applications reduce gracefully
to the traditional UNIX process model when only one thread is active.
No restrictions are applied to system interfaces beyond the traditional
UNIX process model. An otherwise single-threaded application can be
linked with the new libthread with no difference in behavior other than
the fact that system library locks become real locks, not the stub
functions in libc that do nothing and return success. Until more threads
are created, all such locks are uncontended.
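The difference between stub locks and real locks can be sketched as
follows. This is a minimal illustration, not the actual libc or libthread
code: the names lock_ops, stub_ops, real_ops, use_real_locks() and
counter_increment() are hypothetical, and POSIX mutexes stand in for the
system library locks.

```c
#include <pthread.h>

/* Hypothetical lock-operation table; not the real libc/libthread layout. */
struct lock_ops {
    int (*lock)(pthread_mutex_t *);
    int (*unlock)(pthread_mutex_t *);
};

/* Stubs in the style the text describes: do nothing and return success. */
static int stub_lock(pthread_mutex_t *m)   { (void)m; return 0; }
static int stub_unlock(pthread_mutex_t *m) { (void)m; return 0; }

static const struct lock_ops stub_ops = { stub_lock, stub_unlock };
static const struct lock_ops real_ops = { pthread_mutex_lock,
                                          pthread_mutex_unlock };

/* Library code calls through whichever table is currently active. */
static const struct lock_ops *active_ops = &stub_ops;

/* Hypothetical switch point: from here on, locks are real locks. */
void use_real_locks(void)
{
    active_ops = &real_ops;
}

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* A library routine that protects its state with a "system library lock". */
long counter_increment(void)
{
    long v;

    active_ops->lock(&counter_lock);
    v = ++counter;
    active_ops->unlock(&counter_lock);
    return v;
}
```

With only one thread running, counter_increment() returns the same
values whichever table is active; the real lock is simply acquired and
released without contention.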
Static Linking and Dynamic Linking
----------------------------------
We have also been of two minds concerning dynamic linking.
On the one hand, we insist that applications must be dynamically
linked with Solaris libraries, and in particular with the Solaris
Application Binary Interface (ABI), or else we make no guarantees
of binary compatibility from release to release or even from
patch to patch. On the other hand, we have never made the hard
decision to stop shipping and using Solaris 32-bit static libraries.
We did make this decision for our 64-bit world; no Solaris 64-bit
static libraries are shipped to customers, at least from the OS-NET
consolidation, and there have been no evil consequences.
Applications statically linked with Solaris libraries, at least
those statically linked with /usr/lib/libc.a, represent yet a
third process model. Many features available with dynamic linking
are missing: locales, threads, etc. For such applications the system
interface is the system call traps, not the libc library functions. We have
allowed applications to link with our implementation rather than
with our interfaces. We have polluted our library code with
ifdefs to discriminate between these modes of access.
How are these two problems related?
-----------------------------------
The main problem with the static libc and threading is that the static
libc cannot contain any multithreading interface functions. All
multithreading interface functions require initialization to have
taken place before main() is called. That is, the thread pointer register
(%g7 on sparc; the %gs segment register on x86) must be initialized with a
pointer to the user-level thread structure. This occurs via dynamic
initialization and cannot occur with a static threading library.
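The ordering requirement (thread state set up before main() runs) can be
illustrated with a small sketch. It is an analogy only, not the Solaris
mechanism: it uses the GCC/Clang constructor attribute in place of a
shared object's initialization section, an ordinary pointer variable in
place of %g7 or %gs, and the names uthread, self_ptr, thread_init() and
my_thr_self() are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for the user-level thread structure. */
struct uthread {
    int tid;
};

static struct uthread main_thread;
static struct uthread *self_ptr;    /* stand-in for %g7 / %gs */

/*
 * Compiler extension (GCC/Clang): run this function before main(),
 * much as an initialization section of a shared object would run.
 */
__attribute__((constructor))
static void thread_init(void)
{
    main_thread.tid = 1;
    self_ptr = &main_thread;
}

/*
 * Hypothetical analogue of thr_self(): it gives a meaningful answer
 * only if thread_init() has already run; otherwise it returns -1.
 */
int my_thr_self(void)
{
    return self_ptr != NULL ? self_ptr->tid : -1;
}
```

Any threading interface function written this way depends on the
initialization having happened first, which is the dependency the
paragraph above describes.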
Folding libthread into libc means that we cannot have a libc.a unless
we have special code in all of the thr_*() and pthread_*() source files
to suppress their contents when being compiled statically (non-PIC).
Also, statically linked programs cannot assume they are running in a
multithreaded environment. This puts a constraint on all library code
(at least libraries that are compiled for both static and dynamic
linking). There must be ifdefs to take care of the two different
possible process models. No library code can take advantage of the
newly-provided compiler-supported thread local storage (TLS). All of
this is a maintenance and development problem.
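As a rough sketch of what such ifdefs look like, consider a library that
keeps a "last error" value. The macro BUILD_STATIC and the function names
below are hypothetical, and __thread is the compiler-supported TLS
keyword as spelled by GCC, Clang and the Sun compilers; the code is an
illustration under those assumptions, not taken from the Solaris sources.

```c
#include <pthread.h>

#ifdef BUILD_STATIC                 /* hypothetical build macro */
/* Static build: one copy per process, safe only with a single thread. */
static int last_error;
#else
/* Dynamic build: compiler-supported TLS, one copy per thread. */
static __thread int last_error;
#endif

void set_last_error(int e) { last_error = e; }
int  get_last_error(void)  { return last_error; }

static void *worker(void *arg)
{
    set_last_error(*(int *)arg);    /* touches only this thread's copy */
    return NULL;
}

/*
 * Call set_last_error(value) on a second thread and wait for it.
 * Returns 0 on success, nonzero if the thread could not be run.
 */
int set_last_error_on_other_thread(int value)
{
    pthread_t t;

    if (pthread_create(&t, NULL, worker, &value) != 0)
        return -1;
    return pthread_join(t, NULL);
}
```

Without BUILD_STATIC defined, a value stored by the second thread is
invisible to the caller; a library that must also build statically cannot
rely on that and has to carry both variants.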
Now, assuming that we chose to embrace this maintenance and development
problem and continue to deliver a static libc.a while at the same time
folding libthread into the dynamic libc, there would still remain the
unresolvable problem of partially static applications:
Eleven of the binaries in /sbin are partially static, including
/sbin/init; this has become the new trend. They are statically linked
with everything except libdl (-ldl). They do this so they can dlopen()
objects that they need but that do not reside on the root filesystem.
A dlopen() of almost anything pulls in the dynamic version of libc.
Much, but not all, of libc is already in the binary, and the dynamic
linker will resolve calls from the dlopen'd libc to those libc functions
already in the binary. But the libc functions already in the binary
were not compiled for multithreading. The process would suddenly become
multithreaded, calls would be made from the dynamic libc into the
application's copies of static components of libc, and chaos would ensue.
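The call pattern in question looks roughly like the sketch below. The
function name load_helper() and any path passed to it are hypothetical;
only dlopen(), dlerror() and RTLD_LAZY are real interfaces. The sketch
shows the run-time load itself, not the resulting symbol resolution
against the static copies of libc.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Try to load an optional helper object at run time.
 * Returns the handle from dlopen(), or NULL if the object
 * cannot be loaded (for example, before /usr is mounted).
 */
void *load_helper(const char *path)
{
    void *handle = dlopen(path, RTLD_LAZY);

    if (handle == NULL) {
        const char *why = dlerror();

        fprintf(stderr, "dlopen(%s): %s\n", path,
            why != NULL ? why : "unknown error");
    }
    return handle;
}
```

In a partially static binary, a successful call of this kind is the point
at which the dynamic libc would arrive alongside the libc functions
already linked into the executable.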
Conclusion
----------
Complexity is the enemy. It would be an enormous simplification, both
for ourselves and for our customers, if Solaris presented only one process
model to the world, rather than three. We can do this by folding
libthread into libc and ceasing to deliver any Solaris static libraries.
The implication of ceasing to develop a static libc.a is that we can no
longer produce static binaries of our own. Solaris utilities like
/sbin/init must become dynamically linked, with all necessary libraries
residing in the root filesystem (to be usable before /usr is mounted).
Having decided not to ship a static libc.a, there remains no reason
to ship any other static libraries. The problems described above
with respect to partially static applications should be reason enough.
Therefore, this project requires doing all of the following things at once:
- Folding libthread into libc
- Eliminating Solaris static (archive) libraries
- Converting Solaris static and partially-static executables
into dynamically linked executables, including /sbin/init
- Moving libc.so.1 and ld.so.1 and a few other shared libraries
into the root filesystem (/lib) so they are available before
/usr is mounted
The project does not require the following things and they are not being done:
- Forcing _REENTRANT to be globally defined
- Taking away the ability to create and use static libraries
- Creating any new interfaces