OpenSolaris

Bug ID 6807184
Synopsis rge driver drops off network
State 11-Closed:Duplicate (Closed)
Category:Subcategory driver:rge
Keywords opensolaris
Responsible Engineer Li-zhen You
Reported Against snv_106, snv_111, snv_122
Duplicate Of 6892693
Introduced In
Commit to Fix
Fixed In
Release Fixed
Related Bugs 6881257
Submit Date 18-February-2009
Last Update Date 27-November-2009
Description
Category
    kernel
Sub-Category
    network-driver
Description
    During a large file transfer, a card using the rge driver drops off 
the network. It's not related to the hwchecksum bug (I've tried with and 
without that option in /etc/system). On 106 it happens after 25-30 GB; 
on 101 (2008.11) it happened between 10 and 15 GB transferred. Snoop 
shows only the ARP requests being sent (with no reply). I can bring the 
card back online with unplumb/plumb, but the transfer becomes 
significantly slower than it was originally. If I unplug and replug the 
network cable, I see the link down/link up messages in dmesg, but it has 
no effect on traffic flow. Other network cards in the machine 
continue to work fine when this occurs.
Frequency
    Always
Regression
    Solaris 10
Steps to Reproduce
    1) Copy 30 GB of data to an NFS-shared, ZFS-backed share using an rge 
card
2) Wait for the machine to lose its connection
Expected Result
    the network card shouldn't drop off the network
Actual Result
    the network card drops off the network
Error Message(s)
Test Case
Workaround
Additional configuration information
    sum /kernel/drv/amd64/rge
4707 200 /kernel/drv/amd64/rge
modinfo | grep rge
163 fffffffff7ea9000   a9d8 110   1  rge (Realtek 1Gb Ethernet)
Asus P5QL motherboard with builtin Realtek card
pci bus 0x0002 cardnum 0x00 function 0x00: vendor 0x10ec device 0x8168
Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit 
Ethernet controller
Work Around
N/A
Comments
From Masayuki Murayama:

 The latest rge is at the link below; I built it to test the fix for 6892693.
 Would you try it first?  Please also check that performance has not degraded.
http://homepage2.nifty.com/mrym3/taiyodo/rge.mcast.3.tar.gz

To load the new rge driver into the kernel:

 (1) unload the existing rge:
   unplumb the rge port
   # ifconfig rge0 unplumb

   find the module id of rge
   # modinfo | grep rge
   if the result is:
   200 fffffffff889e000   d420 320   1  rge (Realtek 1Gb Ethernet)
   then,
   # modunload -i 200

 (2) load the new rge
   if you use a 64-bit kernel:
   # modload ./amd64/rge

   if you use a 32-bit kernel:
   # modload ./i386/rge

   ensure the new rge is loaded and running
   # modinfo | grep rge
   200 fffffffff889e000   d420 320   1  rge (Realtek 1Gb Ethernet mcast.3)

 (3) plumb the new rge
   # ifconfig rge0 plumb .......

 (4) then test your applications.
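
The module id in step (1) differs from machine to machine, so it can help to
extract it from the modinfo output instead of copying it by hand. The
following is only a sketch: the helper names rge_module_id, rge_module_desc
and print_reload_cmds are hypothetical (not part of Solaris), and the column
layout is assumed from the sample modinfo lines in this report. The last
helper only prints the commands for review; it does not run them.

```shell
#!/bin/sh
# Sketch only; helper names are hypothetical, not Solaris commands.

# Read `modinfo` output on stdin and print the id (first column) of the
# module whose name field is exactly "rge". Column layout assumed from the
# sample lines in this report: Id Loadaddr Size Info Rev Name (Description).
rge_module_id() {
    awk '$6 == "rge" { print $1; exit }'
}

# Print the text inside the parentheses for the rge module, for example to
# confirm that the test build ("mcast.3") is the one that is loaded.
rge_module_desc() {
    awk '$6 == "rge" {
        sub(/^[^(]*\(/, "")   # drop everything up to the first "("
        sub(/\)[^)]*$/, "")   # drop the last ")" and anything after it
        print
        exit
    }'
}

# Print (not run) the command sequence of steps (1) and (2) for a given
# interface, module id and driver path, so it can be checked before use.
print_reload_cmds() {
    printf 'ifconfig %s unplumb\n' "$1"
    printf 'modunload -i %s\n' "$2"
    printf 'modload %s\n' "$3"
}

# Example on a live system (64-bit kernel assumed):
#   print_reload_cmds rge0 "$(modinfo | rge_module_id)" ./amd64/rge
```

Matching on the name field rather than using a plain grep avoids picking up
other modules whose description happens to contain "rge".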
Anders added the following comment to the OpenSolaris Bugzilla bug:

"Please update 6807184 in bugster with the following information:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6807184

> > The latest rge will be below, which I made to test the fix for 6892693.
> >
> >Would you try it first?  Please ensure if the performance is not degraded too.
> >http://homepage2.nifty.com/mrym3/taiyodo/rge.mcast.3.tar.gz

I've verified that the bug occurs in rge from opensolaris 0906 and no
longer occurs in your rge, at least on my osol 0906 box.

Verification of "exists": I've created a zvol with shareiscsi=on, used an Apple
iMac (running 10.6.2) as the iSCSI initiator and used Helios LanTest against
that storage to run some small disk benchmarks on the share, including writing
and reading a 3 GB file from the share. During the third loop of this
benchmark, the transfers stalled.

Verification using the new rge driver: Same as above, but the benchmark loop
has been running for a couple of hours without any problems. The benchmark has
run 40 times so far, so I assume that the issue has been fixed in the
new rge driver.

My testing equipment is limited; I'm seeing a constant transfer rate of 47 MB/s
for read and 26 MB/s for write benchmarks on that specific link between my two
boxes, so I don't see many issues yet and won't complain about around 400 Mbit/s
on a 7 euro NIC  :-) 

Regards,

Anders"
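
For reference, the rates quoted in the comment above can be converted from
MB/s to Mbit/s with a one-line helper. This is a sketch only: the function
name mbytes_to_mbits is hypothetical, and it assumes 1 byte = 8 bits and
ignores protocol overhead, since the comment does not say how the benchmark
counts bytes.

```shell
#!/bin/sh
# Sketch only; mbytes_to_mbits is a hypothetical helper name.

# Convert a whole-number rate in MB/s to Mbit/s (1 byte = 8 bits,
# protocol overhead ignored).
mbytes_to_mbits() {
    echo $(( $1 * 8 ))
}

mbytes_to_mbits 47   # read rate from the comment, prints 376
mbytes_to_mbits 26   # write rate from the comment, prints 208
```

The read figure of 376 Mbit/s is what the comment rounds to "around 400 Mbit".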