OpenSolaris

Bug ID 6485372
Synopsis kill failed to check zombie process group
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:brandz
Keywords
Responsible Engineer Edward Pilatowicz
Reported Against
Duplicate Of
Introduced In solaris_nevada
Commit to Fix snv_56
Fixed In snv_56
Release Fixed solaris_nevada(snv_56)
Related Bugs 6475920
Submit Date 24-October-2006
Last Update Date 25-January-2007
Description
part of linux kill man page:
.........
ESRCH  The  pid or process group does not exist.  Note that an existing
              process might be a zombie, a  process  which  already  committed
              termination, but has not yet been wait()ed for.
...........


in the test case, we fork a child process, make it the leader of a new process group, and let it exit without the parent waiting for it. the parent then calls kill(-child_pid,0) and checks the return value and errno; according to the man page excerpt above, both should be zero.

this is the test case; it is the same as the one in 6463442, except that it calls kill(-child_pid,0):
miin:/tmp> cat kill_13.c
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <errno.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>

#define TEST_SIG        SIGKILL
#define NCHILDREN       5

int
main(int argc, char **argv)
{
        pid_t pid1;
        int ret;

        switch (pid1 = fork()) {
        case -1: /* Here pid is -1, the fork failed */
                perror("fork failed!");
                break;

        case 0: /* pid of zero is the child */
                printf("CHILD: child pid=%d\n", getpid());

                /* become the leader of a new session and process group */
                setsid();

                printf("CHILD: child exit...\n");
                exit(0);

        default: /* pid greater than zero is parent getting the child's pid */
                printf("PARENT: father pid=%d, child pid=%d\n", getpid(), pid1);
                sleep(3);       /* give the child time to exit and become a zombie */
                errno = 0;      /* so a stale errno is not reported on success */
                ret = kill(-pid1, 0);

                if(ret != 0 && errno == ESRCH) {
                        printf("PARENT: kill(-%d, 0) failed with ESRCH\n", pid1);
                        printf("PARENT: expected success or a different error condition\n");
                } else {
                        printf("PARENT: pass!, ret=%d, errno=%d(%s)\n", ret, errno, strerror(errno));
                }
        }

        return(0);
}


run it in native linux OS:
miin:/tmp> uname -a
Linux miin 2.4.21-4.ELsmp #1 SMP Fri Oct 3 17:52:56 EDT 2003 i686 i686 i386 GNU/Linux
miin:/tmp> ./kill_13
CHILD: child pid=6726
CHILD: child exit...
PARENT: father pid=6725, child pid=6726
PARENT: pass!, ret=0, errno=0(Success)
miin:/tmp>

run it in solaris global:
bash-3.00# zonename
global
bash-3.00# uname -a
SunOS covete 5.11 onnv-gate:2006-10-23 i86pc i386 i86pc
bash-3.00# ./kill_13
CHILD: child pid=128397
CHILD: child exit...
PARENT: father pid=128396, child pid=128397
PARENT: pass!

run it in linux zone:
[root@lxzone1 tmp]# hostname
lxzone1
[root@lxzone1 tmp]# ./kill_13
CHILD: child pid=128405
CHILD: child exit...
PARENT: father pid=128404, child pid=128405
PARENT: kill(-128405, 0) failed with ESRCH
PARENT: expected success or a different error condition

the fix for this bug partially overlaps with:
	6475920 pidof doesn't work in lx branded zones

currently, zombie pids are not stored in our pid hash table,
so lookups for zombie pids via lx_lpid_to_spair() will always
fail.  this issue is fixed by 6475920.

once 6475920 is fixed, there is a remaining problem in:
	usr/src/uts/common/brand/lx/syscall/lx_kill.c:lx_kill()

specifically, if we try to send a signal to a process group whose leader
has exited, then we'll call pgfind() with a negative pid value, it
will return NULL, and we won't send any signals.
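
the remaining problem can be illustrated with a small user-space sketch. this is not the lx_kill() source: sketch_pgfind(), the table, and both callers are hypothetical stand-ins, and the sketch only models the sign handling described above (a lookup keyed by positive process group ids being handed a negative value). the "fixed" variant shows one plausible shape of a fix, not necessarily the change that was delivered.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a process group record. */
typedef struct sketch_pgrp {
        pid_t pg_id;            /* process group id, always positive */
        int pg_nmembers;        /* members (live or zombie) still in the group */
} sketch_pgrp_t;

/* Hypothetical table of existing process groups. */
static sketch_pgrp_t sketch_pgrps[] = { { 100, 2 }, { 200, 1 } };

/* Hypothetical lookup, keyed by positive process group id. */
static sketch_pgrp_t *
sketch_pgfind(pid_t pgid)
{
        size_t i;

        for (i = 0; i < sizeof (sketch_pgrps) / sizeof (sketch_pgrps[0]); i++) {
                if (sketch_pgrps[i].pg_id == pgid)
                        return (&sketch_pgrps[i]);
        }
        return (NULL);
}

/* Models the problem: the negative kill() argument is looked up as-is. */
static int
sketch_kill_pgrp_buggy(pid_t pid)
{
        return (sketch_pgfind(pid) != NULL ? 0 : ESRCH);
}

/* Assumed fix: convert the negative argument to a group id first. */
static int
sketch_kill_pgrp_fixed(pid_t pid)
{
        pid_t pgid = (pid < 0) ? -pid : pid;

        return (sketch_pgfind(pgid) != NULL ? 0 : ESRCH);
}
```

with this table, sketch_kill_pgrp_buggy(-100) reports ESRCH even though group 100 exists, while sketch_kill_pgrp_fixed(-100) finds it.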
Work Around
N/A
Comments
N/A