Description
Look at the gettxt(3C) source code and focus on the saved_locale variable,
which is a static variable shared by all threads and is not protected by any lock.
This shows that gettxt(3C) is an MT-Unsafe function.
----- usr/src/lib/libc/port/gen/gettxt.c -----
static char *saved_locale = NULL;
char *
gettxt(const char *msg_id, const char *dflt_str)
{
....
if (saved_locale != NULL && strcmp(curloc, saved_locale) == 0) {
for (i = 0; i < db_count; i++)
if (strcmp(db_info[i].db_name, msgfile) == 0)
break;
} else { /* new locale - clear everything */ <----- (1)
if (saved_locale) <----- (2)
free(saved_locale); <----- (3)
if ((saved_locale = malloc(strlen(curloc)+2)) == NULL) <-(4)
return (handle_return(dflt_str));
(void) strcpy(saved_locale, curloc); <----- (5)
Suppose there are three threads in a process and they call gettxt(3C)
simultaneously. All three threads can reach (1) at almost the same time
because gettxt(3C) does no locking.
If thread-2 and thread-3 execute (2) after thread-1 has executed (4),
both thread-2 and thread-3 go on to execute (3). This means free(3C) is
called twice for the same address, which corrupts the malloc(3C) heap.
Eventually the process may dump core.
This code clearly shows that gettxt(3C) is an MT-Unsafe function.
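The interleaving described above can be replayed deterministically in a small model. The sketch below is an illustration only, not code from libc: the heap is modeled as an array of slots so that the second free can be counted instead of actually corrupting memory, and the names model_malloc, model_free and replay_race are hypothetical.

```c
#include <stdio.h>

enum { HEAP_SLOTS = 4, NO_BLOCK = -1 };

static int in_use[HEAP_SLOTS];		/* 1 while a modeled block is allocated */
static int double_frees;		/* frees of a block that is not allocated */
static int saved_locale = NO_BLOCK;	/* stands in for the shared pointer */

/* Hypothetical stand-in for malloc(3C): hand out the first unused slot. */
static int
model_malloc(void)
{
	for (int i = 0; i < HEAP_SLOTS; i++) {
		if (!in_use[i]) {
			in_use[i] = 1;
			return (i);
		}
	}
	return (NO_BLOCK);
}

/* Hypothetical stand-in for free(3C): count a double free, do not crash. */
static void
model_free(int block)
{
	if (!in_use[block]) {
		double_frees++;
		return;
	}
	in_use[block] = 0;
}

/* Replay one schedule of the race; returns the number of double frees. */
static int
replay_race(void)
{
	/* thread-1: nothing to free at (2), allocates at (4) */
	saved_locale = model_malloc();

	/* thread-2 and thread-3 both test saved_locale at (2) */
	int seen_by_t2 = saved_locale;
	int seen_by_t3 = saved_locale;

	/* both then reach (3) with the same block */
	if (seen_by_t2 != NO_BLOCK)
		model_free(seen_by_t2);
	if (seen_by_t3 != NO_BLOCK)
		model_free(seen_by_t3);

	return (double_frees);
}
```

Calling replay_race() once reports a single double free, which corresponds to the second call to free(3C) at (3) for the same address.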
I created a test program in which multiple threads call gettxt(3C) and ran
it on a multi-CPU system. It dumps core. We think this also shows that
gettxt(3C) is MT-Unsafe.
----- test log -----
$ cat test_gettxt.c
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <thread.h>
#include <nl_types.h>
#include <locale.h>
#define THRS 16
void *
t_func(void *arg)
{
int i;
char *str, msgid[80];
i = thr_self();
poll(0, NULL, 500);
sprintf(msgid, "FJSV%02d_msg:%d", i, (i%10)+1);
str = gettxt(msgid , "Message file is not found.\n");
thr_exit(NULL);
}
int
main(int argc, char *argv[])
{
thread_t tid;
void *t_func(void *);
int i;
for (i = 0; i < THRS; i++) {
if (thr_create(NULL, NULL, t_func, NULL, THR_BOUND, &tid)==-1){
fprintf(stdout, "Fail to thr_create():%d\n", errno);
exit(1);
}
}
while (thr_join(NULL, NULL, NULL) == 0);
return (0);
}
$ cc -mt test_gettxt.c -o test_gettxt
$ psrinfo
0 on-line since 06/30/05 09:05:48
1 on-line since 06/30/05 09:05:48
2 on-line since 06/30/05 09:05:48
3 on-line since 06/30/05 09:05:48
$ while :
> do
> ./test_gettxt
> done
Bus Error - core dumped
$ mdb test_gettxt core
Loading modules: [ ]
> $c
libc.so.1`_free_unlocked+0x40(4300869e, ff33c008, 4300869e, ff33c008, 0, 0)
libc.so.1`free+0x20(4300869e, 4300869e, 0, ff33c008, 0, 0)
libc.so.1`_free_tsd+0x44(21968, ff2ef1f0, 0, ff386000, 0, 0)
libthread.so.1`tsd_exit+0x6c(0, ff271c00, 0, ff387600, 0, ff271c00)
libthread.so.1`_thrp_exit+0x68(ff271c00, 0, 0, ff386000, fe4fbed0, ff386000)
libthread.so.1`_t_cancel+0x104(ff271c00, fe4fbee0, 0, ff3682ec, 4d, 10bd0)
libthread.so.1`_thr_exit_common+0xc4(ff271c00, 1, 1, 0, ff33ec98, fe4fbf44)
t_func+0x7c(0, ff271c00, 0, 0, 0, 0)
libthread.so.1`_lwp_start(0, 0, 0, 0, 0, 0)
We ran the same test program on Solaris 9 with libumem.
% cc -mt test_gettxt.c -o test_gettxt -lumem
% ldd ./test_gettxt
libumem.so.1 => /usr/lib/libumem.so.1
libthread.so.1 => /usr/lib/libthread.so.1
libc.so.1 => /usr/lib/libc.so.1
libdl.so.1 => /usr/lib/libdl.so.1
/usr/platform/FJSV,GPUS/lib/libc_psr.so.1
$ while :
> do
> ./test_gettxt
> done
Abort - core dumped
% mdb ./test_gettxt ./core
Loading modules: [ libumem.so.1 libthread.so.1 libc.so.1 ld.so.1 ]
> $c
libc.so.1`_lwp_kill+8(6, 80, 0, fe5fb9b8, 0, ff35eaf4)
libumem.so.1`umem_do_abort+0x18(1, a, a000000, 7efefeff, 81010100, ff00)
libumem.so.1`umem_err_recoverable+0x74(ff36064c, 3ffb8, ff360630, fe5fb9d0, ff2bf8d5, 0)
libumem.so.1`free+0x54(3ffb8, 3ffb9, ffffffff, 0, 43, 0)
libc.so.1`gettxt+0x24c(ff2bed60, 10c20, 10c20, 5, 0, 0)
t_func+0x70(0, 0, 0, 0, 0, 0)
libthread.so.1`_lwp_start(0, 0, 0, 0, 0, 0)
This stack shows that the free(3C) call in gettxt(3C) corrupts the
malloc area.
gettxt(3C) appears to be MT-Unsafe, although the manual page says it is Safe with exceptions.
xxxxx@xxxxx.com 2005-07-01 06:54:26 GMT
The customer (CU) wants the gettxt(3C) source code fixed, not the documentation (man page).
Here is the CU's request.
---------------
A multithreaded application which uses gettxt(3C) dumped core at our customer's site
because gettxt(3C) is MT-Unsafe.
Many other multithreaded applications may use gettxt(3C) and may
dump core in the future.
Fixing only the manual means that every multithreaded application which calls
gettxt(3C) must be modified not to call it.
How will Sun announce this issue to customers and software
developers?
This issue is similar to BugID#4408502. In that case,
Sun modified the lfmt(3C) source code and released a libc patch.
That approach does not require customers or software vendors to
change application code.
---------------
I think the CU's request is reasonable; we need to fix gettxt(3C) itself.
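One possible shape for such a fix is to serialize the saved_locale update with a mutex. The following is only a minimal sketch under stated assumptions, not the actual libc change: gettxt_lock and update_saved_locale are hypothetical names, POSIX mutexes are used for illustration, and a real fix would also have to cover the db_info and db_count accesses in the same critical section.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static char *saved_locale = NULL;

/* Hypothetical lock; the name is not taken from the libc source. */
static pthread_mutex_t gettxt_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical helper: make saved_locale a private copy of curloc.
 * Returns 0 on success and -1 if memory cannot be allocated.
 */
static int
update_saved_locale(const char *curloc)
{
	int ret = 0;

	(void) pthread_mutex_lock(&gettxt_lock);
	if (saved_locale == NULL || strcmp(curloc, saved_locale) != 0) {
		/* allocate the new copy before releasing the old one */
		char *copy = malloc(strlen(curloc) + 1);

		if (copy == NULL) {
			ret = -1;	/* old value stays valid */
		} else {
			(void) strcpy(copy, curloc);
			free(saved_locale);	/* only one thread gets here at a time */
			saved_locale = copy;
		}
	}
	(void) pthread_mutex_unlock(&gettxt_lock);
	return (ret);
}
```

Because the test, the free and the assignment all happen while the lock is held, no two threads can free the same pointer. Allocating the copy before freeing the old value also means saved_locale never points at freed memory when malloc fails.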