OpenSolaris

Bug ID 6564094
Synopsis BAD TRAP panic in aio library while running global_dev/tc_aio_read assertion 13
State 10-Fix Delivered (Fix available in build)
Category:Subcategory library:libaio
Keywords no-s10 | osc | scnv | test-stopper
Responsible Engineer Prakash Sangappa
Reported Against snv_64 , snv_65 , snv_64a , solaris_10u4
Duplicate Of
Introduced In solaris_10
Commit to Fix snv_68
Fixed In snv_68
Release Fixed solaris_nevada(snv_68) , solaris_10u7(s10u7_06) (Bug ID:2171510)
Related Bugs 6587965 , 6648760
Submit Date 31-May-2007
Last Update Date 14-January-2009
Description
Cores will be copied to /net/coresvr.sfbay/export/cores4/bugid.

panic[cpu1]/thread=fffffffec1d16280: 
BAD TRAP: type=e (#pf Page fault) rp=ffffff0004522b80 addr=28 occurred in module
 "genunix" due to a NULL pointer dereference


tc_aio_read: 
#pf Page fault
Bad kernel fault at addr=0x28
pid=3080, pc=0xfffffffffb93744d, sp=0xffffff0004522c70, eflags=0x10207
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 28 cr3: 3f11b000 cr8: c
        rdi: fffffffeceb73028 rsi: fffffffec6bf9e00 rdx: fffffffec1d16280
        rcx:                0  r8:                0  r9:                0
        rax:                0 rbx:                1 rbp: ffffff0004522c80
        r10: fffffffec1e5ca20 r11:                4 r12:               16
        r13: fffffffeceb73060 r14:           100000 r15: fffffffeceb73000
        fsb:                0 gsb: fffffffec1692800  ds:               4b
         es:               4b  fs:                0  gs:              1c3
        trp:                e err:                2 rip: fffffffffb93744d
         cs:               30 rfl:            10207 rsp: ffffff0004522c70
         ss:               38

ffffff0004522a60 unix:die+c8 ()
ffffff0004522b70 unix:trap+135b ()
ffffff0004522b80 unix:cmntrap+e9 ()
ffffff0004522c80 genunix:aio_deq+1d ()
ffffff0004522e40 kaio:aiorw+4b1 ()
ffffff0004522e80 kaio:kaio+212 ()     
ffffff0004522ec0 genunix:syscall_ap+8f ()
ffffff0004522f10 unix:brand_sys_syscall32+1a3 ()

syncing file systems... 4 1 done
dumping to /dev/dsk/c2t0d0s1, offset 431030272, content: kernel

Mail from an SC engineer who looked at the cores.

============================================================================

This looks like a Solaris aio bug. I'd file a bug against Solaris.

The console panic trace seems to suggest that a brandz syscall is triggering the panic, but closer inspection shows that it is a generic aio call:

 > ::status
debugging crash dump vmcore.1 (64-bit) from pkuda1
operating system: 5.11 snv_64a (i86pc)
panic message:
BAD TRAP: type=e (#pf Page fault) rp=ffffff00041d6b80 addr=28 occurred in module
 "genunix" due to a NULL pointer dereference
dump content: kernel pages only
 > $C
ffffff00041d6c80 aio_deq+0x1d(fffffffed1aec028, fffffffec64f9a80)
ffffff00041d6e40 aiorw+0x4b1(b, 8047510, 1, 1)
ffffff00041d6e80 kaio+0x212(fffffffec1c7f918, ffffff00041d6eb8)
ffffff00041d6ec0 syscall_ap+0x8f()
ffffff00041d6f10 sys_syscall32+0x101()

The issue was observed during pxfs-related automated stress tests. Since we do not run any tests without the cluster software installed, we cannot say whether it also happens outside of a cluster.

Running the pxfs tests from scate should result in a panic at some point.
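
A note on reading the panic message above: a fault address of 0x28 reported as a NULL pointer dereference is consistent with code accessing a structure member at offset 0x28 through a NULL base pointer (rax is 0 in the register dump). The sketch below only illustrates that arithmetic. The structure demo_req and the helper member_addr are hypothetical and are not the kernel's aio request structure; this report does not identify which structure or member aio_deq was touching.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real kernel structure: 40 bytes of
 * other members precede `prev`, so `prev` sits at offset 0x28.
 */
struct demo_req {
    uint64_t pad[5];          /* occupies offsets 0x00..0x27 */
    struct demo_req *prev;    /* offset 0x28 */
};

/* Compile-time check that the assumed layout really gives 0x28. */
_Static_assert(offsetof(struct demo_req, prev) == 0x28,
               "prev is expected at offset 0x28");

/*
 * Address the CPU would access for base->prev. It is computed
 * arithmetically, so nothing is actually dereferenced here.
 */
static uintptr_t member_addr(uintptr_t base)
{
    return base + offsetof(struct demo_req, prev);
}
```

With a base of 0, member_addr returns 0x28, which matches the addr and cr2 values in the trap output. Any non-NULL base simply shifts the result by the same offset.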
Work Around
N/A
Comments
N/A