OpenSolaris

Bug ID 6840063
Synopsis usbsacm stops sending data out when pushed hard
State 10-Fix Delivered (Fix available in build)
Category:Subcategory driver:usbsacm
Keywords opensolaris
Responsible Engineer Raymond Chen
Reported Against snv_90 , snv_111
Duplicate Of
Introduced In solaris_10
Commit to Fix snv_120
Fixed In snv_120
Release Fixed solaris_nevada(snv_120)
Related Bugs 6588968 , 6719062
Submit Date 12-May-2009
Last Update Date 31-July-2009
Description
Category
   solaris/ppp (Solaris Networking)
Sub-Category
   ppp_kernel
Description
   After connecting with ppp, sppp0 is established and works up to a point. I can ping and do some generally low-bandwidth things, like ssh or telnet. I can even load some web pages.
But as soon as I attempt to load a number of pages at once, the ppp interface stops working.
In addition, once it has stopped working, ifconfig -a hangs partway through displaying the sppp0 interface entry.
I have verified using an external host that when the issue occurs, we are indeed no longer sending packets out.
For what it's worth, I'm more than happy to collect whatever data you might like me to collect.
Frequency
   Always
Regression
   No
Steps to Reproduce
   Start the ppp interface, connect to theregister.co.uk and 'open in new tab' a bunch of the stories. By the time you have started opening the 6th tab, it will have stopped.
 If snoop is already running, it can be seen that requests are still going out from snoop's point of view, but no return packets arrive.
Expected Result
   One might expect that it would actually keep working and display the requested pages.
Actual Result
   sppp0 interface stops working, thus, webpages etc all stop responding. 
To restart the interface, one needs to pkill pppd (at times with a -9), possibly clean up interfaces with ifconfig unplumb then re-connect. 
Error Message(s)
   No error messages as such - Things just stop progressing. 
Test Case
   N/A
Workaround
   
Additional configuration information
   OpenSolaris 2008.11, updated to kernel 111a using pkg image-update from pkg.opensolaris.org/dev
Nokia 6310 USB connected phone
Phone, cable and service provider demonstrated to work fine on a different laptop running Nevada 110.
I thought it might have been a hardware issue. However, I first encountered this issue on an HP tx2-1015au, which has an ATI chipset and an AMD CPU, and have since encountered it on an HP (Compaq) 2510p, which is an all-Intel affair.
More oddly, it seems to work just fine in real nevada (sxce) build 110 on an MSI Megabook S270, but not in OpenSolaris on either of my HP units.
PPP is stuck because USB is stuck.  Looking at the stream in the
attached system dump, I see this:

> 0xffffff014f8ef140::queue
            ADDR MODULE         FLAGS NBLK
ffffff014f8ef140 usbsacm       244020  449 ffffff015dec6700

There are hundreds of messages on the queue, and nothing is moving.
Looking at the flags and transmit state on the driver, I see this:

> 0xffffff0154266900::print usbser_port_t port_state port_flags port_wq_data_cnt port_wq_thread port_flowc
port_state = 0x4
port_flags = 0x44
port_wq_data_cnt = 0xfc78
{
    port_wq_thread.thr_cv = {
        _opaque = 0x1
    }
    port_wq_thread.thr_flags = 0x1
    port_wq_thread.thr_port = 0xffffff0154266900
    port_wq_thread.thr_func = usbser_wq_thread
    port_wq_thread.thr_arg = 0xffffff0154266958
}
port_flowc = 0

In other words, the port is open, no flow control has been asserted,
and the write queue thread is just sleeping away.  I have to guess
that either the hardware itself is broken and has simply stopped
transmitting, or that the complex thread wakeup logic in usbser.c
("GSD") is broken.

I suspect the latter, as there've been *many* reports now of people
with stuck USB serial ports, and it seems unlikely that lots of people
all have defective gear.

Plus, the same hardware is known to work on other systems.

Thus, transferring over to USB team to investigate.
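
As a reading aid for the mdb output above, here is a minimal Python sketch that converts the hexadecimal values to decimal and lists which bits are set in port_flags. It assumes port_wq_data_cnt is a byte count. The flag names are not given in this report, so the sketch reports bit positions only, and the helper set_bits is hypothetical.

```python
# Values copied from the ::queue and ::print output in this report.
nblk = 449                  # NBLK column from ::queue
port_flags = 0x44
port_wq_data_cnt = 0xfc78   # ASSUMPTION: this counter is in bytes


def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


queued_bytes = int(port_wq_data_cnt)

print(f"blocks on queue:     {nblk}")
print(f"queued data:         {queued_bytes} (bytes, if the assumption holds)")
print(f"port_flags bits set: {set_bits(port_flags)}")
```

Mapping the bit positions to flag names requires the usbser driver headers for the build in question, which are not reproduced here.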
Work Around
There is a report that the hang does not occur when multi-core operation is disabled (see psradm(1M)). I have not had a chance to verify this, but it could serve as a workaround until the root cause is found.
Comments
N/A