OpenSolaris

Bug ID 6745259
Synopsis Calling ddi_remove_minor_node in async thread causes deadlock
State 10-Fix Delivered (Fix available in build)
Category:Subcategory ib_sw:ibtl
Keywords
Responsible Engineer Rajkumar Sivaprakasam
Reported Against snv_96
Duplicate Of
Introduced In solaris_nevada
Commit to Fix snv_104
Fixed In snv_104
Release Fixed solaris_nevada(snv_104), solaris_10u8(s10u8_02) (Bug ID:2169397)
Related Bugs 6276452
Submit Date 5-September-2008
Last Update Date 4-December-2008
Description
When an HCA is DR'ed out with cfgadm -c unconfigure <hca apid>, the HCA's detach routine is called with ndi_devi_enter held on both the PHCI dip (the HCA's dip) and the VHCI dip (the IB Nexus dip). The HCA detach routine calls ibc_pre_detach, which calls back, in an async thread, all the IB clients that have opened the HCA.

IBDM is one of the clients that is called back in the async thread, and it calls back into ibnex via ibnex_dm_callback. This routine calls ddi_remove_minor_node on the HCA GUID node created for this HCA in the IB Nexus device tree.

ddi_remove_minor_node used to do i_devi_enter on the parent dip (the IB Nexus dip) before removing the minor node. With the changes for CR 6276452, ddi_remove_minor_node now does ndi_devi_enter instead of i_devi_enter. Because the second ndi_devi_enter is done from a different thread (the taskq async thread) than the one that did the first ndi_devi_enter (the cfgadm thread), the async thread waits for the first thread to do ndi_devi_exit. The first thread, in turn, waits for all the callback async threads to complete, leading to a simple two-way deadlock.
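
The wait cycle described above can be reproduced in a small userland model. The following is only a sketch using POSIX threads, not kernel code: model_dip_t, model_devi_tryenter, model_devi_exit, model_async_callback and model_detach_would_deadlock are hypothetical names made up for illustration, and the "enter" is modelled as a non-blocking try so that the program reports the point where the real ndi_devi_enter would block instead of actually hanging. The model assumes that the thread already holding the node may re-enter it and that any other thread must wait. The variant that releases the nexus before the callbacks run is there for contrast only; it is not a description of the fix that was delivered.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for the busy state of a devinfo node. */
typedef struct model_dip {
	pthread_mutex_t	lock;	/* protects the fields below */
	bool		busy;	/* true while some thread has "entered" */
	pthread_t	owner;	/* meaningful only while busy is true */
	int		depth;	/* re-entry count of the owning thread */
} model_dip_t;

/* Argument and result slot for the modelled async callback. */
typedef struct cb_arg {
	model_dip_t	*nexus;
	bool		would_block;
} cb_arg_t;

/*
 * Non-blocking model of "enter". Returns false at the point where a
 * blocking enter would have to wait for another thread to exit.
 */
static bool
model_devi_tryenter(model_dip_t *dip)
{
	bool ok = true;

	pthread_mutex_lock(&dip->lock);
	if (dip->busy && !pthread_equal(dip->owner, pthread_self())) {
		ok = false;		/* held by a different thread */
	} else {
		dip->busy = true;	/* first entry, or owner re-entry */
		dip->owner = pthread_self();
		dip->depth++;
	}
	pthread_mutex_unlock(&dip->lock);
	return (ok);
}

static void
model_devi_exit(model_dip_t *dip)
{
	pthread_mutex_lock(&dip->lock);
	if (--dip->depth == 0)
		dip->busy = false;
	pthread_mutex_unlock(&dip->lock);
}

/* Models the async callback that removes the minor node under the nexus. */
static void *
model_async_callback(void *p)
{
	cb_arg_t *arg = p;

	if (model_devi_tryenter(arg->nexus)) {
		/* the minor node would be removed here */
		model_devi_exit(arg->nexus);
		arg->would_block = false;
	} else {
		arg->would_block = true;
	}
	return (NULL);
}

/*
 * Models the detach path. Returns true if the callback thread would have
 * blocked while this thread was waiting for it, i.e. a two-way deadlock.
 */
static bool
model_detach_would_deadlock(bool hold_nexus_during_callbacks)
{
	model_dip_t nexus = { .lock = PTHREAD_MUTEX_INITIALIZER };
	cb_arg_t arg = { .nexus = &nexus, .would_block = false };
	pthread_t tid;
	bool entered;
	int rc;

	entered = model_devi_tryenter(&nexus);	/* the "cfgadm thread" */
	assert(entered);
	if (!hold_nexus_during_callbacks)
		model_devi_exit(&nexus);

	rc = pthread_create(&tid, NULL, model_async_callback, &arg);
	assert(rc == 0);
	(void) pthread_join(tid, NULL);	/* wait for the callback to finish */

	if (hold_nexus_during_callbacks)
		model_devi_exit(&nexus);
	return (arg.would_block);
}
```

In this model, model_detach_would_deadlock(true) corresponds to the scenario in this report: the first thread keeps the nexus entered while it waits for the callback, and the callback cannot enter. With false, the callback enters and exits without waiting.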
Work Around
N/A
Comments
N/A