Description
The mdprop_op code can deadlock with ioctl code that is trying
to resolve a devid. This race appears to have existed for a
long time (since s10_13), but recent changes in snv_96 issue
prop_op(9E) calls more frequently and fix some missing locking,
which changes the timing. With the snv_96 changes, this problem
sometimes shows up when running tslvm (even when the fixes for
6743774 and 6744223 are in place).
Race:
o One thread is in the devinfo driver, performing
a devinfo snapshot: it has an active ndi_devi_enter from
the root down. It then calls mdprop_op, which takes
md_unit_readerlock() (with ndi_devi_enter still active on "/").
o Another thread may be processing an md ioctl, holding
md_unit_writerlock, and the ioctl may involve resolving
a devid, which needs to lock the device tree from the
root down (via resolve_pathname).
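The lock-order inversion described above can be illustrated with a small user-space sketch. This is not the kernel code: the two pthread mutexes, their names, and simulate_lock_inversion() are hypothetical stand-ins for the ndi_devi_enter hold on "/" and the md unit lock (which is really a reader/writer lock). Both sides are stepped through from a single thread, and pthread_mutex_trylock() is used in place of a blocking acquire so that the sketch reports the deadlock instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins: tree_lock ~ ndi_devi_enter("/"), unit_lock ~ md unit lock. */
static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t unit_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Step through the two code paths in the problematic interleaving.
 * Returns 1 if each side ends up waiting on the lock the other holds.
 */
int
simulate_lock_inversion(void)
{
	int rc, snapshot_blocked, md_blocked;

	/* Snapshot side: enters the device tree from the root down. */
	rc = pthread_mutex_lock(&tree_lock);
	assert(rc == 0);

	/* md side: takes the unit lock before resolving a devid. */
	rc = pthread_mutex_lock(&unit_lock);
	assert(rc == 0);

	/* Snapshot side reaches mdprop_op and wants the unit lock. */
	rc = pthread_mutex_trylock(&unit_lock);
	snapshot_blocked = (rc == EBUSY);
	if (rc == 0)
		(void) pthread_mutex_unlock(&unit_lock);

	/* md side reaches resolve_pathname and wants the tree lock. */
	rc = pthread_mutex_trylock(&tree_lock);
	md_blocked = (rc == EBUSY);
	if (rc == 0)
		(void) pthread_mutex_unlock(&tree_lock);

	/* Release the original holds so the sketch can be re-run. */
	(void) pthread_mutex_unlock(&unit_lock);
	(void) pthread_mutex_unlock(&tree_lock);

	return (snapshot_blocked && md_blocked);
}
```

With blocking acquires in place of the trylock calls, neither side could make progress, which is the state shown in the thread stacks in this report.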
One approach to fixing this (while still supporting nblocks-based
mdprop_op queries) would be for svm to implement a devt-to-size
hash, so that mdprop_op does not need to traverse minor node unit
structures. Maybe using something like...

	static mod_hash_t *md_sizebydevt;
	mod_hash_val_t hv;

	/* create the hash once */
	md_sizebydevt = mod_hash_create_idhash("md_sizebydevt", 512,
	    mod_hash_null_valdtor);

	/* record the size for a devt */
	(void) mod_hash_insert(md_sizebydevt,
	    (mod_hash_key_t)(intptr_t)devt, (mod_hash_val_t)size);

	/* look up the size in mdprop_op */
	if (mod_hash_find(md_sizebydevt,
	    (mod_hash_key_t)(intptr_t)devt, &hv) == 0)
		nblocks = (size_t)hv;
	else
		nblocks = 0;

	/* drop the entry */
	(void) mod_hash_remove(...
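As a rough user-space analogue of the devt-to-size idea, the sketch below keeps sizes in a small chained hash guarded by its own short-held mutex, so a lookup never touches a per-unit lock. It does not use the kernel mod_hash API; the names size_insert(), size_lookup(), size_remove(), struct size_ent and the use of uint64_t for a devt are all assumptions made for illustration. Unlike a plain insert into an existing key, size_insert() here updates the entry in place.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define	NCHAINS	512	/* same chain count as the sketch in this report */

/* Hypothetical entry: one devt and the size recorded for it. */
struct size_ent {
	uint64_t devt;
	uint64_t nblocks;
	struct size_ent *next;
};

static struct size_ent *chains[NCHAINS];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Record (or update) the size for a devt. Returns 0, or -1 if out of memory. */
int
size_insert(uint64_t devt, uint64_t nblocks)
{
	struct size_ent *e;
	int rv = 0;

	(void) pthread_mutex_lock(&table_lock);
	for (e = chains[devt % NCHAINS]; e != NULL; e = e->next)
		if (e->devt == devt)
			break;
	if (e == NULL && (e = malloc(sizeof (*e))) != NULL) {
		e->devt = devt;
		e->next = chains[devt % NCHAINS];
		chains[devt % NCHAINS] = e;
	}
	if (e != NULL)
		e->nblocks = nblocks;
	else
		rv = -1;
	(void) pthread_mutex_unlock(&table_lock);
	return (rv);
}

/* Look up the size for a devt; reports 0 blocks when there is no entry. */
uint64_t
size_lookup(uint64_t devt)
{
	struct size_ent *e;
	uint64_t nblocks = 0;

	(void) pthread_mutex_lock(&table_lock);
	for (e = chains[devt % NCHAINS]; e != NULL; e = e->next)
		if (e->devt == devt) {
			nblocks = e->nblocks;
			break;
		}
	(void) pthread_mutex_unlock(&table_lock);
	return (nblocks);
}

/* Drop the entry for a devt. Returns 0 if removed, -1 if it was not present. */
int
size_remove(uint64_t devt)
{
	struct size_ent **pp, *e;
	int rv = -1;

	(void) pthread_mutex_lock(&table_lock);
	for (pp = &chains[devt % NCHAINS]; (e = *pp) != NULL; pp = &e->next)
		if (e->devt == devt) {
			*pp = e->next;
			free(e);
			rv = 0;
			break;
		}
	(void) pthread_mutex_unlock(&table_lock);
	return (rv);
}
```

The point of the design is that table_lock is a leaf lock: it is only held for the duration of a chain walk and nothing else is acquired while it is held, so a lookup from the property path cannot take part in the ordering problem described above.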
Here are details on deadlocked threads:
Thread holding md_unit_writerlock (in md_probe_one), trying to ndi_devi_enter
"/" as a result of resolving a devid:
ffffff00041d6c80 fffffffffbc36f70 0 0 60 ffffff014d276e9c
PC: _resume_from_idle+0xf1 THREAD: md`start_daemon()
stack pointer for thread ffffff00041d6c80: ffffff00041d6690
[ ffffff00041d6690 _resume_from_idle+0xf1() ]
swtch+0x221()
cv_wait+0x73(ffffff014d276e9c, ffffff014d276db0)
ndi_devi_enter+0xbe(ffffff014d276d48, ffffff00041d67e4)
devi_config_one+0x16d(ffffff014d276d48, ffffff0189848140, ffffff00041d68d8,
4080, 0)
ndi_devi_config_one+0xd8(ffffff014d276d48, ffffff0189848140,
ffffff00041d68d8, 4080)
resolve_pathname+0x16d(ffffff016e0096b8, ffffff00041d6968, 0, 0)
e_ddi_hold_devi_by_path+0x2d(ffffff016e0096b8, 0)
e_devid_cache_to_devt_list+0x2fd(ffffff017b8e8826, ffffff01731ed4c2,
ffffff00041d6af4, ffffff00041d6ad8)
ddi_lyr_devid_to_devlist+0x46(ffffff017b8e8826, ffffff01731ed4c2,
ffffff00041d6af4, ffffff00041d6ad8)
md`md_resolve_bydevid+0xf4(c7, 1b00002c02, 3)
md_raid`raid_probe_dev+0x111(ffffff0187451550, c7)
md`md_probe_one+0x80(ffffff0194a87780)
md`md_daemon+0x1ea(0, ffffffffc0645790)
md`start_daemon+0x16(ffffffffc0645790)
thread_start+8()
Thread doing the devinfo snapshot: it has an active enter of "/" and
"/pseudo" and is performing prop_op(9E) (NOTE: no hold of the "md" node),
where it is trying to take md_unit_readerlock.
[1]> ffffff017b95bc00::findstack -v
stack pointer for thread ffffff017b95bc00: ffffff0004ac5780
[ ffffff0004ac5780 _resume_from_idle+0xf1() ]
ffffff0004ac57c0 swtch+0x221()
ffffff0004ac57f0 cv_wait+0x73(ffffff0187451580, ffffff0187451578)
ffffff0004ac5830 md`md_unit_readerlock_common+0x7a(ffffff0187451550, 0)
ffffff0004ac5850 md`md_unit_readerlock+0x13(ffffff0187451550)
ffffff0004ac58e0 md`mdprop_op+0xc1(55000000c7, ffffff014d2679f0, 2, 8009,
fffffffffbf3fde0, ffffff0004ac5938, ffffff0004ac595c)
ffffff0004ac5990 devinfo`di_getprop_add+0xc0(0, 1, ffffff0194a87c00,
ffffff014d2679f0, fffffffff84c5db0, fffffffffbf3fde0, 55000000c7, 1000, 0, 0,
13cf09, ffffff0004ac5a30)
ffffff0004ac5a60 devinfo`di_getprop+0x27b(0, ffffff014d267a38,
ffffff0345d72d10, ffffff0194a87c00, ffffff014d2679f0)
ffffff0004ac5ad0 devinfo`di_copynode+0x461(ffffff014d2679f0, ffffff0150cdb180
, ffffff0194a87c00)
ffffff0004ac5b30 devinfo`di_copytree+0xdb(ffffff014d276d48, ffffff0340c07020,
ffffff0194a87c00)
ffffff0004ac5be0 devinfo`di_snapshot+0x1a3(ffffff0194a87c00)
ffffff0004ac5c10 devinfo`di_snapshot_and_clean+0x1f(ffffff0194a87c00)
ffffff0004ac5ca0 devinfo`di_ioctl+0x46d(5800000002, df07, beb1e46c, 100001,
ffffff01789397b8, ffffff0004ac5e8c)
ffffff0004ac5ce0 cdev_ioctl+0x48(5800000002, df07, beb1e46c, 100001,
ffffff01789397b8, ffffff0004ac5e8c)
ffffff0004ac5d20 specfs`spec_ioctl+0x86(ffffff016cb8a300, df07, beb1e46c,
100001, ffffff01789397b8, ffffff0004ac5e8c, 0)
ffffff0004ac5da0 fop_ioctl+0x7b(ffffff016cb8a300, df07, beb1e46c, 100001,
ffffff01789397b8, ffffff0004ac5e8c, 0)
ffffff0004ac5eb0 ioctl+0x174(2c, df07, beb1e46c)
ffffff0004ac5f00 sys_syscall32+0x1fc()
[1]> ffffff014d2679f0::prtconf
DEVINFO NAME
ffffff014d276d48 i86pc (driver name: rootnex)
ffffff014d26f9e0 pseudo, instance #0 (driver name: pseudo)
ffffff014d2679f0 md, instance #0 (driver name: md)
[1]> ffffff014d2679f0::devinfo -s
DEVINFO MAJ REFCNT NODENAME NODESTATE
INST CIRCULAR BINDNAME STATE
THREAD FLAGS
ffffff014d2679f0 85 8 md@0 DS_READY
0 0 md <S_EVADD,S_NEED_RESET>
0 <>
^^^NOTE: code has fix for 6743774,6744223
(ndi_devi_exit across prop_op(9E) call)
[1]> ffffff014d26f9e0::devinfo -s
DEVINFO MAJ REFCNT NODENAME NODESTATE
INST CIRCULAR BINDNAME STATE
THREAD FLAGS
ffffff014d26f9e0 2 110 pseudo DS_READY
0 0 pseudo <S_EVADD,S_NEED_RESET>
ffffff017b95bc00 <BUSY,MADE_CHILDREN
[1]> ffffff014d276d48::devinfo -s
DEVINFO MAJ REFCNT NODENAME NODESTATE
INST CIRCULAR BINDNAME STATE
THREAD FLAGS
ffffff014d276d48 1 19 i86pc DS_READY
-1 0 i86pc <>
ffffff017b95bc00 <BUSY,MADE_CHILDREN
Pointer to webrev that makes mdprop_op occur more
frequently and changes timing by fixing missing locks
(including locks in some devid code):
http://onnv.sfbay/log/nv/2008/07/30.cth/webrev/