OpenSolaris

Bug ID 6620393
Synopsis evch_evq_evzalloc assumes kmem_alloc_tryhard cannot fail
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:sysevent
Keywords bzero | install | kernelbase | panic | u5fma2
Responsible Engineer Gavin Maltby
Reported Against snv_76
Duplicate Of
Introduced In solaris_10
Commit to Fix snv_79
Fixed In snv_79
Release Fixed solaris_nevada(snv_79) , solaris_10u5(s10u5_08) (Bug ID:2157968)
Related Bugs 4226895 , 6576925
Submit Date 23-October-2007
Last Update Date 7-December-2007
Description
During installation of snv_76 (ON) on the system toody, the machine panicked. The panic was 100% reproducible and had not been seen in ON-PIT previously.

The panic was seen on multiple occasions and did not necessarily always happen at the exact same stage of the install process (though it was always fairly late into it). 

The machine on which the panic was seen is more commonly used for Network Storage PIT, and the panic was first seen during that test run. To determine which consolidation was causing the panic, snv_76 (ON) was installed without the latest NWS bits. The panic was still observed.

The output of two separate panics is as follows:



panic[cpu1]/thread=c9170de0: bzero: arguments below kernelbase



c9170ab8 unix:bzero+18 (0, 834)

c9170ae4 genunix:evch_evq_evzalloc+56 (824, 4)

c9170b24 genunix:sysevent_evc_alloc+86 (fead0cec, fead0cdc,)

c9170bf4 genunix:sysevent_evc_publish+d4 (eee0c130, fead0cec,)

c9170c30 genunix:fm_ereport_post+6c (f3654f60, 4)

c9170d00 cpu.generic:gcpu_ereport_post+203 (c1b8f000, 4, fecae8)

c9170d40 cpu.generic:gcpu_mca_drain+a3 (0, c1b8f000, c1b61e)

c9170d74 genunix:errorq_drain+f6 (c130d940)

c9170d84 genunix:errorq_intr+e (c130d940, 0)

c9170db0 unix:av_dispatch_softvect+66 (1)

c9170dcc unix:dispatch_softint+1d (0, 0)



panic: entering debugger (continue to save dump)

panicsys+0x329(fec01a7e, c9170ab8, fec4276c, 1)
vpanic+0xc3(fec01a7e, c9170ab8)
panic+0x12()
bzero+0x18(0, 834)
evch_evq_evzalloc+0x56(824, 4)
sysevent_evc_alloc+0x86(fead0cec, fead0cdc, c9170b50, d, 7cc, 4)
sysevent_evc_publish+0xd4(eee0c130, fead0cec, fead0cdc, fead0cd4, fead0fb0, 
f3654f60)
fm_ereport_post+0x6c(f3654f60, 4)
cpu.generic`gcpu_ereport_post+0x203(c1b8f000, 4, fecae844, febceee8, 21080813, 
d4224002)
cpu.generic`gcpu_mca_drain+0xa3(0, c1b8f000, c1b61e00)
errorq_drain+0xf6(c130d940)
errorq_intr+0xe(c130d940, 0)
av_dispatch_softvect+0x66(1)
dispatch_softint+0x1d(0, 0)
switch_sp_and_call+0xf(c9170ddc, fe81d2bc, 0, 0)
dosoftint+0x47(c9128d44)
do_interrupt+0x112(c9128d44, c9045fc0)
_interrupt+0xe7()
mach_cpu_idle+0xd()
cpu_idle+0x8e()
idle+0xde(0, 0)
thread_start+8()
[1]>   ::status
debugging live kernel (32-bit) on toody
operating system: 5.11 snv_76 (i86pc)
CPU-specific support: AMD
DTrace state: inactive
stopped on: debugger entry trap
[1]> $c
kmdb_enter+0xa()
debug_enter+0x27(fe8d7a68)
panicsys+0x329(fec01a7e, c9170ab8, fec4276c, 1)
vpanic+0xc3(fec01a7e, c9170ab8)
panic+0x12()
bzero+0x18(0, 834)
evch_evq_evzalloc+0x56(824, 4)
sysevent_evc_alloc+0x86(fead0cec, fead0cdc, c9170b50, d, 7cc, 4)
sysevent_evc_publish+0xd4(eee0c130, fead0cec, fead0cdc, fead0cd4, fead0fb0, 
f3654f60)
fm_ereport_post+0x6c(f3654f60, 4)
cpu.generic`gcpu_ereport_post+0x203(c1b8f000, 4, fecae844, febceee8, 21080813, 
d4224002)
cpu.generic`gcpu_mca_drain+0xa3(0, c1b8f000, c1b61e00)
errorq_drain+0xf6(c130d940)
errorq_intr+0xe(c130d940, 0)
av_dispatch_softvect+0x66(1)
dispatch_softint+0x1d(0, 0)
switch_sp_and_call+0xf(c9170ddc, fe81d2bc, 0, 0)
dosoftint+0x47(c9128d44)
do_interrupt+0x112(c9128d44, c9045fc0)
_interrupt+0xe7()
mach_cpu_idle+0xd()
cpu_idle+0x8e()
idle+0xde(0, 0)
thread_start+8()
[1]> ::cpuinfo
 ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
  0 fec20a38  1f    1    0  -1   no    no t-0    c23ccde0 (idle)
  1 fec24ac8  1b    0    0 160   no    no t-23   c9170de0 sched
  2 c2b21080  1f    0    0  -1   no    no t-148  c9180de0 (idle)
  3 c91d2a80  1f    0    0  -1   no    no t-712  c9305de0 (idle)
[1]> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     240486               939   24%
Anon                         9001                35    1%
Exec and libs                3230                12    0%
Page cache                  41485               162    4%
Free (cachelist)           540744              2112   53%
Free (freelist)     2980707481099       77309411328  18387422397734060032%

Total                     1012621              3955
Physical                  1012620              3955
[1]> panic_thread/J
panic_thread:
panic_thread:   c9170de0        
[1]> c9170de0::findstack
Possible stack pointers for thread c9170de0:
  c9170430 (11)
  c9170660 (2)
  c91707b4 (6)
  c91709e0 (17)
  c9170a14 (4)
  c9170b94 (3)
  c9170cc0 (2)
[1]> c9170de0::thread
    ADDR    STATE  FLG PFLG SFLG   PRI  EPRI PIL     INTR DISPTIME BOUND PR
c9170de0 onproc    809    0   13   160     0   1 c9128de0    57077    -1  2
[1]> c9170de0::thread -p
    ADDR     PROC      LWP     CRED
c9170de0 fec1ef10        0        0
[1]> fec1ef10::ps -flt
S    PID   PPID   PGID    SID    UID      FLAGS     ADDR NAME
R      0      0      0      0      0 0x00000001 fec1ef10 sched
        T        t0 <TS_STOPPED>
        L      lwp0 ID: 1
[1]> fec1ef10::ptree
fec1ef10  sched
     c91cd398  fsflush
     c91cdc30  pageout
     c91ce4c8  init
          c96e74d0  ypbind
          c96e38a8  inetd
          c91ca008  nscd
          c91ca8a0  rpcbind
          d0ad0148  kcfd
          c96e5b08  syslogd
          c91ccb00  devfsadm
          c91cb9d0  syseventd
          c91cb138  svc.configd
          c91cc268  svc.startd
               d0ad1278  install-ui-start
                    d0ad09e0  install-begin
                         d0ad23a8  install-solaris
                              d0acf8b0  pfinstall
                                   c96e4140  pkginstall
[1]> 
[1]> (All Done)
(Exiting BASIC PANIC ANALYSIS)
[1]> 
 TIMEOUT
PANIC   !
RESULT=9

 Journal|OS_Installation|FAIL
1

Thu Oct 18 20:56:50 BST 2007

Merged the following fails with this ATS ...

install:n/a

Detail:

Created by user electron Thu Oct 18 02:38:15 BST 2007
!!! *** toody PANIC JOURNAL *** !!!



panic[cpu0]/thread=c2611de0: bzero: arguments below kernelbase



c2611ab8 unix:bzero+18 (0, 834)

c2611ae4 genunix:evch_evq_evzalloc+56 (824, 4)

c2611b24 genunix:sysevent_evc_alloc+86 (fead0cec, fead0cdc,)

c2611bf4 genunix:sysevent_evc_publish+d4 (f57abaa0, fead0cec,)

c2611c30 genunix:fm_ereport_post+6c (f1e8d9c0, 4)

c2611d00 cpu.generic:gcpu_ereport_post+203 (c1b8f000, 4, fecae8)

c2611d40 cpu.generic:gcpu_mca_drain+a3 (0, c1b8f000, c1b61e)

c2611d74 genunix:errorq_drain+f6 (c130d940)

c2611d84 genunix:errorq_intr+e (c130d940, 0)

c2611db0 unix:av_dispatch_softvect+66 (1)

c2611dcc unix:dispatch_softint+1d (0, 0)



panic: entering debugger (continue to save dump)

panicsys+0x329(fec01a7e, c2611ab8, fec4276c, 1)
vpanic+0xc3(fec01a7e, c2611ab8)
panic+0x12()
bzero+0x18(0, 834)
evch_evq_evzalloc+0x56(824, 4)
sysevent_evc_alloc+0x86(fead0cec, fead0cdc, c2611b50, d, 7cc, 4)
sysevent_evc_publish+0xd4(f57abaa0, fead0cec, fead0cdc, fead0cd4, fead0fb0, 
f1e8d9c0)
fm_ereport_post+0x6c(f1e8d9c0, 4)
(BASIC PANIC ANALYSIS)
cpu.generic`gcpu_ereport_post+0x203(c1b8f000, 4, fecae844, febceee8, 21080813, 
d4224001)
cpu.generic`gcpu_mca_drain+0xa3(0, c1b8f000, c1b61e00)
errorq_drain+0xf6(c130d940)
errorq_intr+0xe(c130d940, 0)
av_dispatch_softvect+0x66(1)
dispatch_softint+0x1d(0, 0)
switch_sp_and_call+0xf(c2611ddc, fe81d2bc, 0, 0)
dosoftint+0x47(c23ccd44)
do_interrupt+0x112(c23ccd44, fec3d634)
_interrupt+0xe7()
mach_cpu_idle+0xd()
cpu_idle+0x8e()
idle+0xde(0, 0)
thread_start+8()
[0]>   ::status
debugging live kernel (32-bit) on toody
operating system: 5.11 snv_76 (i86pc)
CPU-specific support: AMD
DTrace state: inactive
stopped on: debugger entry trap
[0]> $c
kmdb_enter+0xa()
debug_enter+0x27(fe8d7a68)
panicsys+0x329(fec01a7e, c2611ab8, fec4276c, 1)
vpanic+0xc3(fec01a7e, c2611ab8)
panic+0x12()
bzero+0x18(0, 834)
evch_evq_evzalloc+0x56(824, 4)
sysevent_evc_alloc+0x86(fead0cec, fead0cdc, c2611b50, d, 7cc, 4)
sysevent_evc_publish+0xd4(f57abaa0, fead0cec, fead0cdc, fead0cd4, fead0fb0, 
f1e8d9c0)
fm_ereport_post+0x6c(f1e8d9c0, 4)
cpu.generic`gcpu_ereport_post+0x203(c1b8f000, 4, fecae844, febceee8, 21080813, 
d4224001)
cpu.generic`gcpu_mca_drain+0xa3(0, c1b8f000, c1b61e00)
errorq_drain+0xf6(c130d940)
errorq_intr+0xe(c130d940, 0)
av_dispatch_softvect+0x66(1)
dispatch_softint+0x1d(0, 0)
switch_sp_and_call+0xf(c2611ddc, fe81d2bc, 0, 0)
dosoftint+0x47(c23ccd44)
do_interrupt+0x112(c23ccd44, fec3d634)
_interrupt+0xe7()
mach_cpu_idle+0xd()
cpu_idle+0x8e()
idle+0xde(0, 0)
thread_start+8()
[0]> ::cpuinfo
 ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
  0 fec24ac8  1b    1    0 160   no    no t-2    c2611de0 sched
  1 c2b22100  1f    1    0  -1   no    no t-1    c9128de0 (idle)
  2 c2b21080  1f    0    0  -1   no    no t-85   c9180de0 (idle)
  3 c91d2a80  1f    0    0  -1   no    no t-119  c9305de0 (idle)
[0]> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     240451               939   24%
Anon                        12910                50    1%
Exec and libs                3233                12    0%
Page cache                  41769               163    4%
Free (cachelist)           542241              2118   54%
Free (freelist)     2881923227633       73014444032  18387422397734060032%

Total                     1012621              3955
Physical                  1012620              3955
[0]> panic_thread/J
panic_thread:
panic_thread:   c2611de0        
[0]> c2611de0::findstack
Possible stack pointers for thread c2611de0:
  c261113c (3)
  c261128c (4)
  c2611374 (2)
  c2611430 (2)
  c2611498 (7)
  c2611660 (2)
  c26117b4 (6)
  c26119e0 (17)
  c2611a14 (4)
  c2611b94 (3)
  c2611cc0 (2)
[0]> c2611de0::thread
    ADDR    STATE  FLG PFLG SFLG   PRI  EPRI PIL     INTR DISPTIME BOUND PR
c2611de0 onproc    809    0   13   160     0   1 c23ccde0    977b6    -1  2
[0]> c2611de0::thread -p
    ADDR     PROC      LWP     CRED
c2611de0 fec1ef10        0        0
[0]> fec1ef10::ps -flt
S    PID   PPID   PGID    SID    UID      FLAGS     ADDR NAME
R      0      0      0      0      0 0x00000001 fec1ef10 sched
        T        t0 <TS_STOPPED>
        L      lwp0 ID: 1
[0]> fec1ef10::ptree
fec1ef10  sched
     c91cd398  fsflush
     c91cdc30  pageout
     c91ce4c8  init
          c91ca008  ypbind
          e58013b0  inetd
          ca3278b0  nscd
          c91bec38  rpcbind
          ca329b10  kcfd
          c91bc9d8  syslogd
          c91be3a0  devfsadm
          c91ccb00  syseventd
          c91cb138  svc.configd
          c91cc268  svc.startd
               c91bb010  install-ui-start
                    ca329278  install-begin
                         c91bb8a8  install-solaris
                              e58024e0  pfinstall
                                   e5800b18  pkginstall
[0]> 
[0]> (All Done)
(Exiting BASIC PANIC ANALYSIS)
[0]> 
 TIMEOUT
PANIC   !
RESULT=9

 Journal|OS_Installation|FAIL
1

The machine was heavily loaded with Network Storage HBAs (5 in total). At the time of install it contained the following HBAs:

Crystal-2A (QLogic QLA2342-SUN PCI/PCI-X)
Prism (x2) (QLogic QLA210-SUN PCI/PCI-X)
Pyramid-E (Emulex LP11002 PCI/PCI-X)
Rainbow (Emulex LP10000DC-S PCI/PCI-X)

It is not known whether the hardware configuration is significant. The architecture of the system itself is as follows:

| toody | i86pc,Galaxy | i86pc,Opteron |


From the panic dump above, one thing that looks amiss is the amount of free memory being reported.

[0]> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     240451               939   24%
Anon                        12910                50    1%
Exec and libs                3233                12    0%
Page cache                  41769               163    4%
Free (cachelist)           542241              2118   54%
Free (freelist)     2881923227633       73014444032  18387422397734060032%


During a bugster search I noticed CR 4226895, which looks similar, though not identical.
Work Around
N/A
Comments
N/A