|
Description
|
This bug is seen in osol and was previously tracked in Bugzilla:
http://defect.opensolaris.org/bz/show_bug.cgi?id=6630
We are now seeing it on an X8420 blade (oaf602), which has four e1000g NICs.
Loading kmdb...
SunOS Release 5.11 Version snv_111 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
[.. Hang ..]
Welcome to kmdb
kmdb: unable to determine terminal type: assuming `vt100'
Loaded modules: [ scsi_vhci mac uppc neti sd ufs unix cpu_ms.AuthenticAMD.15
krtld s1394 uhci hook genunix ip usba specfs pcplusmp cpu.generic sctp arp
sockfs ]
[0]> ::ptree
fffffffffbc2c030 sched
ffffff01d3274a48 fsflush
ffffff01d32756a8 pageout
ffffff01d3276308 init
ffffff01d3270008 dlmgmtd
ffffff01d3272528 svc.configd
ffffff01d3273188 svc.startd
ffffff01d3273de8 net-physical
ffffff01d326e6b0 netstrategy
[0]> :c
According to Sean, this is also seen on an x4600 with the following configuration:
With osol_0906-109 it's still hanging around the same place.
Some more investigation shows it could be related to the network
interfaces on this box.
Booting again gets us here:
.
..
installing namefs, module id 153.
load 'sys/portfs' id 154 loaded @ 0xfffffffff7ed6000/0xffffffffc004bfd0 size
28032/304
installing portfs, module id 154.
Booting to milestone "milestone/single-user:default".
load 'exec/intpexec' id 155 loaded @ 0xfffffffff7e659b0/0xffffffffc0040a48 size
1456/136
installing intpexec, module id 155.
load 'drv/sysevent' id 156 loaded @ 0xfffffffff7e233e8/0xffffffffc004c100 size
4448/368
installing sysevent, module id 156.
/pci@0,0/pci108e,cb84@2/storage@4/disk@0,0 (sd0) online
At this point the last process running was netstrategy:
[4]> ::ptree
fffffffffbc2ba70 sched
ffffff08ef051a48 fsflush
ffffff08ef0526a8 pageout
ffffff08ef053308 init
ffffff08ef04dc68 dlmgmtd
ffffff08ef050de8 svc.configd
ffffff08ef050188 svc.startd
ffffff08ef04b6b0 net-physical
ffffff08ef04aa50 netstrategy
[4]> ::ps
S PID PPID PGID SID UID FLAGS      ADDR             NAME
R 0   0    0    0   0   0x00000001 fffffffffbc2ba70 sched
R 3   0    0    0   0   0x00020001 ffffff08ef051a48 fsflush
R 2   0    0    0   0   0x00020001 ffffff08ef0526a8 pageout
R 1   0    0    0   0   0x4a004000 ffffff08ef053308 init
R 16  1    16   16  15  0x42000000 ffffff08ef04dc68 dlmgmtd
R 9   1    9    9   0   0x42000000 ffffff08ef050de8 svc.configd
R 7   1    7    7   0   0x42000000 ffffff08ef050188 svc.startd
R 17  7    7    7   0   0x42014000 ffffff08ef04b6b0 net-physical
R 19  17   7    7   0   0x4a004000 ffffff08ef04aa50 netstrategy
netstrategy seems to be waiting for a NIC to respond:
[4]> 0t19::pid2proc | ::walk thread | ::findstack
stack pointer for thread ffffff08ef6b1a80: ffffff003ca26520
[ ffffff003ca26520 _resume_from_idle+0xf1() ]
ffffff003ca26550 swtch+0x160()
ffffff003ca265b0 cv_wait_sig+0x14b()
ffffff003ca26610 str_cv_wait+0xbc()
ffffff003ca266c0 strwaitq+0x1fe()
ffffff003ca267d0 kstrgetmsg+0x3dc()
ffffff003ca26820 ldi_getmsg+0x9b()
ffffff003ca268b0 dl_op+0x63()
ffffff003ca26910 dl_bind+0x8f()
ffffff003ca26970 strplumb`getmacaddr+0xec()
ffffff003ca269c0 strplumb`matchmac+0x87()
ffffff003ca26a30 walk_devs+0x4f()
ffffff003ca26aa0 walk_devs+0xff()
[4]>
[4]> ffffff003ca26970-10
0xffffff003ca26960: 0xffffff003ca26988 0xffffff08e87f7138
0xffffff003ca269c0 strplumb`matchmac+0x87
[4]> 0xffffff08e87f7138 ::whatis
ffffff08e87f7138 is ffffff08e87f7138+0, allocated from dev_info_node_cache
[4]> 0xffffff08e87f7138 ::print -t struct dev_info
{
struct dev_info *devi_parent = 0xffffff08e0a44ae0
struct dev_info *devi_child = 0
struct dev_info *devi_sibling = 0xffffff08e87f6ec8
char *devi_binding_name = 0xffffff08e0bf4ac5 "pciex8086,105e"
char *devi_addr = 0xffffff08ea30ee00 "0"
int devi_nodeid = 0x3a
int devi_instance = 0
struct dev_ops *devi_ops = e1000g`ws_ops
void *devi_parent_data = 0xffffff08e8c3a000
void *devi_driver_data = 0xffffff08e0bb6000
ddi_prop_t *devi_drv_prop_ptr = 0xffffff08ea43d5f8
ddi_prop_t *devi_sys_prop_ptr = 0
struct ddi_minor_data *devi_minor = 0xffffff08e8c53380
struct dev_info *devi_next = 0xffffff08e87f6ec8
kmutex_t devi_lock = {
void *[1] _opaque = [ 0 ]
}
.
.
.
So it's waiting for a response from an e1000g NIC.
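The reasoning above (skip the generic sleep and STREAMS read frames, then look at who called them) can be sketched as a small script over the `::findstack` output. This is only an illustrative sketch: the `GENERIC_WAIT` set is a hand-picked assumption based on this one stack, and `parse_frames` and `first_interesting` are hypothetical helper names, not part of mdb.

```python
import re

# ::findstack output copied from the kmdb session above.
FINDSTACK = """\
[ ffffff003ca26520 _resume_from_idle+0xf1() ]
ffffff003ca26550 swtch+0x160()
ffffff003ca265b0 cv_wait_sig+0x14b()
ffffff003ca26610 str_cv_wait+0xbc()
ffffff003ca266c0 strwaitq+0x1fe()
ffffff003ca267d0 kstrgetmsg+0x3dc()
ffffff003ca26820 ldi_getmsg+0x9b()
ffffff003ca268b0 dl_op+0x63()
ffffff003ca26910 dl_bind+0x8f()
ffffff003ca26970 strplumb`getmacaddr+0xec()
ffffff003ca269c0 strplumb`matchmac+0x87()
ffffff003ca26a30 walk_devs+0x4f()
ffffff003ca26aa0 walk_devs+0xff()
"""

# frame pointer, optional "module`" prefix, function name, "+0xoffset()"
FRAME_RE = re.compile(
    r"^\s*\[?\s*([0-9a-f]+)\s+(?:(\w+)`)?(\w+)\+0x[0-9a-f]+\(\)"
)

# Assumption: frames that only mean "asleep waiting for a STREAMS message".
GENERIC_WAIT = {
    "_resume_from_idle", "swtch", "cv_wait_sig",
    "str_cv_wait", "strwaitq", "kstrgetmsg", "ldi_getmsg",
}


def parse_frames(text):
    """Return (module, function) pairs, innermost frame first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group(2), match.group(3)))
    return frames


def first_interesting(frames):
    """Return the innermost frame that is not a generic wait routine."""
    for module, function in frames:
        if function not in GENERIC_WAIT:
            return module, function
    return None


if __name__ == "__main__":
    frames = parse_frames(FINDSTACK)
    print(first_interesting(frames))
    print(" <- ".join(fn for _, fn in frames))
```

On this stack the first non-generic frame is `dl_op`, called from `dl_bind` on behalf of `strplumb`getmacaddr`, which is consistent with the thread waiting for a reply to a bind request on the device being walked.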
This x4600 has 8 x e1000g, 1 x ixgb and 2 x nxge NICs in it; it's a heavy
networking rig:
dladm show-phys from snv_108:
LINK     MEDIA     STATE    SPEED  DUPLEX   DEVICE
e1000g4  Ethernet  up       1000   full     e1000g4
nxge0    Ethernet  up       10000  full     nxge0
e1000g1  Ethernet  up       1000   full     e1000g1
e1000g5  Ethernet  up       1000   full     e1000g5
e1000g0  Ethernet  up       1000   full     e1000g0
ixgb0    Ethernet  up       10000  full     ixgb0
e1000g2  Ethernet  up       1000   full     e1000g2
e1000g6  Ethernet  up       1000   full     e1000g6
e1000g3  Ethernet  up       1000   full     e1000g3
e1000g7  Ethernet  up       1000   full     e1000g7
nxge1    Ethernet  unknown  0      unknown  nxge1
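As a quick cross-check of the NIC inventory quoted above, the `dladm show-phys` listing can be tallied per driver. This is a minimal sketch that assumes whitespace-separated columns with a `DEVICE` and a `STATE` header, as in the snv_108 output shown here; `summarize_links` is a hypothetical helper name.

```python
import re
from collections import Counter

# dladm show-phys output from snv_108, copied from the report.
DLADM_SHOW_PHYS = """\
LINK MEDIA STATE SPEED DUPLEX DEVICE
e1000g4 Ethernet up 1000 full e1000g4
nxge0 Ethernet up 10000 full nxge0
e1000g1 Ethernet up 1000 full e1000g1
e1000g5 Ethernet up 1000 full e1000g5
e1000g0 Ethernet up 1000 full e1000g0
ixgb0 Ethernet up 10000 full ixgb0
e1000g2 Ethernet up 1000 full e1000g2
e1000g6 Ethernet up 1000 full e1000g6
e1000g3 Ethernet up 1000 full e1000g3
e1000g7 Ethernet up 1000 full e1000g7
nxge1 Ethernet unknown 0 unknown nxge1
"""


def summarize_links(text):
    """Return (per-driver counts, links whose STATE is not 'up')."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    device_col = header.index("DEVICE")
    state_col = header.index("STATE")
    link_col = header.index("LINK")

    counts = Counter()
    not_up = []
    for row in data:
        # Driver name is the device name minus its trailing instance number.
        driver = re.sub(r"\d+$", "", row[device_col])
        counts[driver] += 1
        if row[state_col] != "up":
            not_up.append(row[link_col])
    return counts, not_up


if __name__ == "__main__":
    counts, not_up = summarize_links(DLADM_SHOW_PHYS)
    for driver, n in sorted(counts.items()):
        print(f"{driver}: {n}")
    print("not up:", ", ".join(not_up))
```

The tally matches the description (8 e1000g, 1 ixgb, 2 nxge), and the only link not reported as up in this listing is nxge1.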
|