An S10 update4 build12 system panicked while running the load_unload test against the xge interface on a pair of V40z systems.
-bash-3.00# uname -a
SunOS waxe 5.10 Generic_120012-13 i86pc i386 i86pc
-bash-3.00# cat /etc/release
Solaris 10 8/07 s10x_u4wos_12 X86
Copyright 2007 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 24 July 2007
-bash-3.00#
panic[cpu0]/thread=fffffe80001a9c80:
BAD TRAP: type=e (#pf Page fault) rp=fffffe80001a94e0 addr=ffffffff900dd000
sched:
#pf Page fault
Bad kernel fault at addr=0xffffffff900dd000
pid=0, pc=0xfffffffffb829b1a, sp=0xfffffe80001a95d8, eflags=0x10213
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f0<xmme,fxsr,pge,mce,pae,pse>
cr2: ffffffff900dd000 cr3: 12c51000 cr8: c
rdi: ffffffff900dcffa rsi: fffffe8f4b594c68 rdx: 5b4
rcx: 46 r8: 5a8 r9: 26
rax: fffffe8f4b594e9c rbx: 5b4 rbp: fffffe80001a9610
r10: 0 r11: 1 r12: fffffe86073223a0
r13: 5b4 r14: ffffffff900dd22e r15: ffffffff908f8b78
fsb: ffffffff80000000 gsb: fffffffffbc25460 ds: 43
es: 43 fs: 0 gs: 1c3
trp: e err: 2 rip: fffffffffb829b1a
cs: 28 rfl: 10213 rsp: fffffe80001a95d8
ss: 30
fffffe80001a93f0 unix:real_mode_end+7051 ()
fffffe80001a94d0 unix:trap+d86 ()
fffffe80001a94e0 unix:cmntrap+13f ()
fffffe80001a9610 unix:bcopy+a ()
fffffe80001a9670 bge:bge_send+4e ()
fffffe80001a96a0 bge:bge_m_tx+8d ()
fffffe80001a96b0 dls:dls_tx+e ()
fffffe80001a96d0 dld:dld_tx_single+1f ()
fffffe80001a96f0 dld:str_mdata_fastpath_put+40 ()
fffffe80001a9780 ip:tcp_lsosend_data+350 ()
fffffe80001a9840 ip:tcp_send+5f5 ()
fffffe80001a9900 ip:tcp_wput_data+471 ()
fffffe80001a9a90 ip:tcp_rput_data+133e ()
fffffe80001a9ad0 ip:squeue_enter_chain+16e ()
fffffe80001a9bd0 ip:ip_input+b20 ()
fffffe80001a9c10 dls:soft_ring_drain+98 ()
fffffe80001a9c60 dls:soft_ring_worker+db ()
fffffe80001a9c70 unix:thread_start+8 ()
panic[cpu0]/thread=fffffe80001a9c80:
BAD TRAP: type=e (#pf Page fault) rp=fffffffffbc4bde0 addr=0 occurred in module
"<unknown>" due to a NULL pointer dereference
syncing file systems...
done
dumping to /dev/dsk/c1t0d0s1, offset 859111424, content: kernel
> $c
bcopy+0xa()
bge_send+0x4e()
bge_m_tx+0x8d()
dls_tx+0xe()
dld_tx_single+0x1f()
str_mdata_fastpath_put+0x40()
tcp_lsosend_data+0x350()
tcp_send+0x5f5()
tcp_wput_data+0x471()
tcp_rput_data+0x133e()
squeue_enter_chain+0x16e()
ip_input+0xb20()
soft_ring_drain+0x98()
soft_ring_worker+0xdb()
thread_start+8()
> ::system
set ip_squeue_soft_ring=0x1 [0t1]
set kmem_flags=0xf [0t15]
I did not run the same test on Nevada because no test environment was available. It is interesting that tcp_lsosend_data() was called while sending packets through the bge interface, since bge does not support LSO so far.
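For anyone reading the trap frame in the first panic message, the short sketch below decodes the page-fault error code and compares the fault address with the registers. It assumes the standard x86 page-fault error-code bit layout and 4 KiB pages. The helper name decode_pf_error is made up for illustration, and the values are copied by hand from the panic output.

```python
# Sketch only: decodes the first BAD TRAP frame from the panic message.
# Assumptions: standard x86 #PF error-code bits, 4 KiB pages.
PAGE_SIZE = 0x1000

# Values copied from the panic output
ERR = 0x2                   # "err: 2"
CR2 = 0xffffffff900dd000    # faulting address
RDI = 0xffffffff900dcffa    # rdi at the time of the trap
RDX = 0x5b4                 # rdx at the time of the trap


def decode_pf_error(err):
    """Hypothetical helper: split a #PF error code into its flag bits."""
    return {
        "page_present": bool(err & 0x01),       # 0 = page not present
        "write": bool(err & 0x02),              # 1 = write access
        "user_mode": bool(err & 0x04),          # 0 = supervisor (kernel)
        "reserved_bit": bool(err & 0x08),
        "instruction_fetch": bool(err & 0x10),
    }


fault = decode_pf_error(ERR)
page_offset = CR2 % PAGE_SIZE   # 0 means the fault is on a page boundary
gap = CR2 - RDI                 # distance from rdi to the fault address

print("fault flags:", fault)
print("fault page offset:", hex(page_offset))
print("bytes from rdi to fault address:", gap)
print("rdx in decimal:", RDX)
```

Read this way, the trap is a kernel-mode write to a page that is not present. The fault address sits exactly on a page boundary, 6 bytes above rdi, and rdx holds 1460 (0x5b4). That is consistent with bcopy() running past the end of a mapped page, but the sketch does not show which buffer was overrun; that needs the core dump.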
Please see the coredumps at /net/greatwall.prc/users/xw161283/coredump/CR6586787.
Please run the same test on Nevada and file a bug if the problem is also there. A bug that exists in both gates needs to be fixed in Nevada first.