|
Description
|
I had a machine hang while running tests (modbash/devicesbash).
It looks like the hang occurred because the following
thread hung while detaching, holding its parent devinfo
node /devices/pseudo busy.
[1]> 2a100f75cc0::findstack -v
stack pointer for thread 2a100f75cc0: 2a100f74ba1
[ 000002a100f74ba1 cv_wait+0x38() ]
000002a100f74c51 i_devi_enter+0x30(300007db448, 400000, 400000, 1, 300007db59c, 0)
000002a100f74d01 ddi_remove_minor_node+0x20(300007db4b0, 0, 2, 300007db448, 7b3619a8, 0)
000002a100f74db1 detach_node+0x12c(18a8000, 8000000, 0, 10420000, 300007db4b0, 300007db448)
000002a100f74e61 i_ndi_unconfig_node+0x110(300007db448, 11c, 8000000, 10ac08c, 14, 10ac000)
000002a100f74f11 i_ddi_detachchild+0x20(300007db448, 8000000, 1834340, 300003e2938, 1000, 2a100f75cc6)
000002a100f74fd1 devi_detach_node+0x6c(300007db448, 8000000, 0, 300000779c8, 80000, 8000000)
000002a100f75091 unconfig_immediate_children+0x98(300003e2938, 0, 300007db448, f4, 2000, 8000000)
000002a100f75151 devi_unconfig_common+0x1a8(300003e2938, 0, 6, 0, 0, f4)
000002a100f75211 mt_config_thread+0xac(3001cb29bc0, 0, 1834340, 1834340, 300003e2938, 30000f0b600)
000002a100f752d1 thread_start+4(3001cb29bc0, 0, ca5a202092100008, d00da030a401401b, e4726020d02a6030, c85da02086210017)
[1]> 300007db448::devinfo -s
DEVINFO MAJ REFCNT NODENAME NODESTATE
INST CIRCULAR BINDNAME STATE
THREAD FLAGS
00000300007db448 244 0 fcip@0 DS_ATTACH
0 0 fcip <S_DETACHING,S_MD_UPDATE,S_EVADD>
0 <>
[1]> 00000300007db448::print struct dev_info devi_state
devi_state = 0x10420000 <<< DEVI_S_MD_UPDATE set
[1]>
[1]> 300007db448::prtconf
DEVINFO NAME
300003ebd08 SUNW,Ultra-4
300003e2938 pseudo, instance #0
300007db448 fcip, instance #0
[1]>
I have attached a core taken by breaking into the
debugger and forcing a core while the machine was hung.
A possible culprit would be manipulation of devi_state
without holding devi_lock.
This failure occurred on springfield.central
xxxxx Enterprise 450 (2 X UltraSPARC-II 248MHz), No Keyboard
OpenBoot 3.26, 256 MB memory installed, Serial #12876880.
It seems like an evaluation of all DEVI_SET_* and DEVI_CLR_*
calls is needed - it is not apparent (to me) that
mutex_owned(&(DEVI(dip)->devi_lock)) is being used
consistently to protect devi_state manipulation.
Also, some usb code does not take devi_lock prior to SET/CLR.
xxxxx@xxxxx.com 2005-1-01 00:15:15 GMT
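To make the devi_state locking concern concrete, here is a minimal
userland sketch (not kernel code) of the discipline being asked for:
every set/clear of a shared state word happens with the node's lock
held, and the set/clear helpers assert that. All names here (struct
node, NODE_S_*, node_set_state, node_clr_state, run_toggle_test) are
hypothetical stand-ins, the flag values are made up, and the lock_held
field is only a rough substitute for mutex_owned(). It does not
reproduce the hang; it only shows the pattern under which two threads
updating different bits cannot lose each other's update.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; NOT the real DEVI_S_* definitions. */
#define NODE_S_DETACHING 0x1u
#define NODE_S_MD_UPDATE 0x2u

/* Hypothetical stand-in for a devinfo node with a lock-protected state word. */
struct node {
    pthread_mutex_t lock;
    int lock_held;      /* debugging aid, rough substitute for mutex_owned() */
    uint32_t state;
};

static void node_lock(struct node *n)
{
    pthread_mutex_lock(&n->lock);
    n->lock_held = 1;
}

static void node_unlock(struct node *n)
{
    n->lock_held = 0;
    pthread_mutex_unlock(&n->lock);
}

/* State changes are a read-modify-write, so the caller must hold the lock. */
static void node_set_state(struct node *n, uint32_t flag)
{
    assert(n->lock_held);
    n->state |= flag;
}

static void node_clr_state(struct node *n, uint32_t flag)
{
    assert(n->lock_held);
    n->state &= ~flag;
}

struct toggle_arg {
    struct node *n;
    uint32_t flag;
    int iters;
};

/* Repeatedly clear and then set one flag, taking the lock for each update. */
static void *toggle(void *p)
{
    struct toggle_arg *a = p;

    for (int i = 0; i < a->iters; i++) {
        node_lock(a->n);
        node_clr_state(a->n, a->flag);
        node_unlock(a->n);

        node_lock(a->n);
        node_set_state(a->n, a->flag);
        node_unlock(a->n);
    }
    return NULL;
}

/* Two threads toggle different bits of the same word; returns the final state. */
uint32_t run_toggle_test(int iters)
{
    struct node n = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };
    struct toggle_arg a = { &n, NODE_S_DETACHING, iters };
    struct toggle_arg b = { &n, NODE_S_MD_UPDATE, iters };
    pthread_t ta, tb;

    pthread_create(&ta, NULL, toggle, &a);
    pthread_create(&tb, NULL, toggle, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return n.state;
}
```

Because each thread finishes by setting its own flag under the lock,
run_toggle_test() should end with both bits set. If the lock calls are
dropped from toggle(), the |= and &= on the shared word can interleave
and one thread's update can be lost, which is the kind of failure
suspected above for devi_state.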
I have had this hang occur two more times (after ~8 hours of testing)
on fatboy.central (16way sun4u) with modbash/devicesbash testing.
|