OpenSolaris

Bug ID 6600688
Synopsis potential suspend/resume race in rge
State 10-Fix Delivered (Fix available in build)
Category:Subcategory driver:rge
Keywords
Responsible Engineer Min Xu
Reported Against
Duplicate Of
Introduced In solaris_nevada
Commit to Fix snv_92
Fixed In snv_92
Release Fixed solaris_nevada(snv_92)
Related Bugs 6247110
Submit Date 4-September-2007
Last Update Date 18-June-2008
Description
I was reviewing the code for rge suspend/resume, and I noticed that the SUSPEND logic does not ensure that the Nemo entry points are prevented from touching the hardware while the device is suspended or suspending.

Note that kernel threads are the *last* thing to be stopped, not devices.  So it is important to ensure that a kernel thread can't wind up accessing the device after it has supposedly been suspended.  This is especially true for the transmit routine, but it applies to all asynchronous entry points, timeouts, cyclics, etc.

The canonical way to do this is to set a flag field, protected by one or more locks, indicating that the device is suspended.  Then any code path that might touch the hardware has to check this flag while holding one of those locks.
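
The flag-under-lock pattern described above can be sketched roughly as follows. This is a user-space simulation, not rge code: the structure and every sim_* name are hypothetical, a pthread mutex stands in for the driver's kernel mutex (kmutex_t with mutex_enter()/mutex_exit()), and a plain integer stands in for a device register.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device soft state; not the real rge structure. */
typedef struct sim_dev {
    pthread_mutex_t lock;      /* protects every field below */
    bool            suspended; /* true while suspended or suspending */
    uint32_t        hw_reg;    /* stand-in for a device register */
    unsigned        hw_writes; /* number of simulated hardware accesses */
} sim_dev_t;

static void
sim_dev_init(sim_dev_t *dp)
{
    pthread_mutex_init(&dp->lock, NULL);
    dp->suspended = false;
    dp->hw_reg = 0;
    dp->hw_writes = 0;
}

/* Suspend path: set the flag first, under the lock, then quiesce. */
static void
sim_suspend(sim_dev_t *dp)
{
    pthread_mutex_lock(&dp->lock);
    dp->suspended = true;
    /* a real driver would stop the chip here, still holding the lock */
    pthread_mutex_unlock(&dp->lock);
}

/* Resume path: reinitialize, then clear the flag under the same lock. */
static void
sim_resume(sim_dev_t *dp)
{
    pthread_mutex_lock(&dp->lock);
    /* a real driver would reprogram the chip here */
    dp->suspended = false;
    pthread_mutex_unlock(&dp->lock);
}

/*
 * Any path that might touch the hardware checks the flag while holding
 * the lock. Returns true if the simulated register was written.
 */
static bool
sim_hw_write(sim_dev_t *dp, uint32_t val)
{
    bool touched = false;

    pthread_mutex_lock(&dp->lock);
    if (!dp->suspended) {
        dp->hw_reg = val;
        dp->hw_writes++;
        touched = true;
    }
    pthread_mutex_unlock(&dp->lock);
    return (touched);
}
```

Because the check and the register access happen under the same lock that the suspend path takes, a kernel thread cannot observe "not suspended" and then touch the device after suspend has completed.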

For m_promisc, for example, what I do is set the device's boolean "promisc" flag to true, but then have something like "if (!statep->suspended) { program hardware }".  This allows the function to complete, and the device will be properly programmed on resume.
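
A minimal sketch of that m_promisc approach, again as a user-space simulation: the promisc_* names and the structure are hypothetical, a pthread mutex stands in for the driver lock, and hw_promisc stands in for the chip's receive-mode register. The software flag is always recorded, and the hardware is only programmed when the device is not suspended; resume then replays the recorded state.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical soft state for the promiscuous-mode example. */
typedef struct promisc_dev {
    pthread_mutex_t lock;       /* protects every field below */
    bool            suspended;
    bool            promisc;    /* software copy of the requested mode */
    bool            hw_promisc; /* stand-in for the receive-mode register */
} promisc_dev_t;

static void
promisc_dev_init(promisc_dev_t *dp)
{
    pthread_mutex_init(&dp->lock, NULL);
    dp->suspended = false;
    dp->promisc = false;
    dp->hw_promisc = false;
}

/* Push the software state to the "hardware". Caller holds the lock. */
static void
promisc_program_hw(promisc_dev_t *dp)
{
    dp->hw_promisc = dp->promisc;
}

/* Entry point analogous to m_promisc: always succeeds. */
static int
promisc_set(promisc_dev_t *dp, bool on)
{
    pthread_mutex_lock(&dp->lock);
    dp->promisc = on;              /* remember the request */
    if (!dp->suspended)
        promisc_program_hw(dp);    /* only touch hardware when awake */
    pthread_mutex_unlock(&dp->lock);
    return (0);
}

static void
promisc_suspend(promisc_dev_t *dp)
{
    pthread_mutex_lock(&dp->lock);
    dp->suspended = true;
    pthread_mutex_unlock(&dp->lock);
}

/* Resume replays whatever was requested while suspended. */
static void
promisc_resume(promisc_dev_t *dp)
{
    pthread_mutex_lock(&dp->lock);
    dp->suspended = false;
    promisc_program_hw(dp);
    pthread_mutex_unlock(&dp->lock);
}
```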

For transmit (m_tx()) what I usually do is just return the mblk list to the caller, and then call mac_tx_update() at the end of my resume processing.
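
The transmit handling might look roughly like this in a user-space simulation. Everything here is hypothetical: pkt_t stands in for an mblk chain, the tx_* names are invented, and tx_notify_upper() merely counts calls where a real driver would call mac_tx_update(). Handing the untouched chain back tells the caller that nothing was consumed, and the notification at the end of resume tells it to try again.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for an mblk chain. */
typedef struct pkt {
    struct pkt *next;
    int         id;
} pkt_t;

/* Hypothetical soft state for the transmit example. */
typedef struct tx_dev {
    pthread_mutex_t lock;            /* protects suspended and sent */
    bool            suspended;
    unsigned        sent;            /* packets handed to the "hardware" */
    atomic_uint     tx_update_calls; /* counts simulated notifications */
} tx_dev_t;

static void
tx_dev_init(tx_dev_t *dp)
{
    pthread_mutex_init(&dp->lock, NULL);
    dp->suspended = false;
    dp->sent = 0;
    atomic_init(&dp->tx_update_calls, 0);
}

/* Stand-in for mac_tx_update(): only records that it was called. */
static void
tx_notify_upper(tx_dev_t *dp)
{
    atomic_fetch_add(&dp->tx_update_calls, 1);
}

/*
 * Entry point analogous to m_tx: returns the unsent remainder of the
 * chain, or NULL if everything was consumed.
 */
static pkt_t *
tx_send(tx_dev_t *dp, pkt_t *chain)
{
    pthread_mutex_lock(&dp->lock);
    if (dp->suspended) {
        pthread_mutex_unlock(&dp->lock);
        return (chain);              /* give the whole list back */
    }
    while (chain != NULL) {
        dp->sent++;                  /* "transmit" one packet */
        chain = chain->next;
    }
    pthread_mutex_unlock(&dp->lock);
    return (NULL);
}

static void
tx_suspend(tx_dev_t *dp)
{
    pthread_mutex_lock(&dp->lock);
    dp->suspended = true;
    pthread_mutex_unlock(&dp->lock);
}

static void
tx_resume(tx_dev_t *dp)
{
    pthread_mutex_lock(&dp->lock);
    dp->suspended = false;
    pthread_mutex_unlock(&dp->lock);
    /* Last step of resume, after dropping the lock in this sketch. */
    tx_notify_upper(dp);
}
```

The sketch issues the notification after dropping the lock on the assumption that the upper layer may re-enter the transmit path in response; whether that ordering matters in a given driver depends on its locking design.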

Have a look at the code in rtls (closed source) to see how to do it.
Work Around
N/A
Comments
N/A