OpenSolaris

Bug ID 6311743
Synopsis callout table lock contention in timeout and untimeout
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:other
Keywords SAE | batoka-perf | kt-scalability | onnv_triage | opl-perf | ostrack | pae-networking | rtiq_reviewed | s10u7RR-waived | spbc_s10X | sps-scale | spsw-see-s10u8-goal
Responsible Engineer Madhavan Venkataraman
Reported Against snv_19 , snv_76 , s10u4_09 , solaris_10
Duplicate Of
Introduced In solaris_2.0
Commit to Fix snv_103
Fixed In snv_103
Release Fixed solaris_nevada(snv_103) , solaris_10u8(s10u8_04) (Bug ID:2169382)
Related Bugs 4208778 , 6565503 , 6572462 , 6586066 , 6627390 , 6693595 , 6699662 , 6775849 , 6781932 , 6784948 , 6809548 , 6811294 , 6822357 , 6827248 , 5074795
Submit Date 16-August-2005
Last Update Date 12-June-2009
Description
CALLOUT_FANOUT_BITS should be scaled up on larger systems to avoid lock contention in timeout_common(), untimeout(), et al.
This bug limits how TCP Receive throughput on Batoka scales with the number of connections.
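
The scaling request can be illustrated with a minimal user-space sketch. It is not the kernel source: the function names are hypothetical, and the assumption that a table is picked by masking a CPU sequence id with the fanout is made only for illustration. A fanout of 3 bits (eight tables) is likewise an assumption, chosen because it matches the eight distinct lock addresses in the lockstat output in this report.

```c
#include <assert.h>

/*
 * Hypothetical sketch: all names are illustrative, not kernel symbols.
 * Map a CPU sequence id onto one of (1 << fanout_bits) callout tables.
 */
static unsigned table_index(unsigned cpu_seqid, unsigned fanout_bits)
{
    assert(fanout_bits < 32);
    return cpu_seqid & ((1u << fanout_bits) - 1u);
}

/* Smallest number of fanout bits such that (1 << bits) >= ncpus. */
static unsigned fanout_bits_for(unsigned ncpus)
{
    unsigned bits = 0;

    assert(ncpus > 0);
    while (bits < 31 && (1u << bits) < ncpus)
        bits++;
    return bits;
}

/* Worst-case number of CPUs that share a single table lock. */
static unsigned cpus_per_table(unsigned ncpus, unsigned fanout_bits)
{
    unsigned ntables;

    assert(fanout_bits < 32);
    ntables = 1u << fanout_bits;
    return (ncpus + ntables - 1u) / ntables;
}
```

Under these assumptions, a hypothetical 256-CPU system with a fixed fanout of 3 bits has up to cpus_per_table(256, 3) = 32 CPUs competing for each lock, while cpus_per_table(256, fanout_bits_for(256)) drops to 1.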

We run uperf for a TCP Receive test on a Batoka populated with 4 nxge ports on different PCI-E root complexes.

Message size=64K, Window size=256K, Number of connections=400, Number of clients=8 (2 connected to each nxge port), MTU=1500 bytes.

# Connections   Throughput (Gbps)   CPU Utilization (usr/sys/idle)
8               6.56                0/6/93
64              24.85               0/32/67
400             13.456              1/52/47
1000            9.352               0/56/44

Analyzing lockstat at 400 connections gives us:

Adaptive mutex spin: 9227825 events in 10.302 seconds (895755 events/sec)

Count indv cuml rcnt     spin Lock                   Caller
-------------------------------------------------------------------------------
569235   6%   6% 0.00       17 0x6009a065000          timeout_common+0x4
566645   6%  12% 0.00       18 0x6009a065000          untimeout+0x24
558153   6%  18% 0.00       18 0x6009a053000          timeout_common+0x4
557459   6%  24% 0.00       18 0x6009a05c000          timeout_common+0x4
557042   6%  30% 0.00       19 0x6009a050000          timeout_common+0x4
556307   6%  36% 0.00       19 0x6009a059000          untimeout+0x24
556123   6%  42% 0.00       19 0x6009a053000          untimeout+0x24
554562   6%  49% 0.00       19 0x6009a04a000          timeout_common+0x4
553372   6%  54% 0.00       19 0x6009a056000          untimeout+0x24
552478   6%  60% 0.00       19 0x6009a05c000          untimeout+0x24
551271   6%  66% 0.00       19 0x6009a059000          timeout_common+0x4
551227   6%  72% 0.00       20 0x6009a050000          untimeout+0x24
549697   6%  78% 0.00       19 0x6009a056000          timeout_common+0x4
548408   6%  84% 0.00       19 0x6009a05f000          untimeout+0x24
546500   6%  90% 0.00       18 0x6009a05f000          timeout_common+0x4
545534   6%  96% 0.00       20 0x6009a04a000          untimeout+0x24

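This shows severe contention on the callout table locks: eight distinct locks, each taken from both timeout_common() and untimeout(), account for 96% of the adaptive mutex spin events.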
Work Around
N/A
Comments
The Tickless Callout project fixes the problem in a generic way
by using per-CPU tables, per-CPU programmable and migratable cyclics,
and an event-based callout handling approach that does not require the
cyclics to go off every tick and poll for expired callouts. The per-CPU
approach solves the lock contention issue.
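
The per-CPU and event-based ideas described above can be sketched in user-space C as follows. This is an illustration under stated assumptions, not the Tickless Callout implementation: every name (cpu_table_t, sketch_timeout, sketch_untimeout, sketch_next_expiry), the fixed table sizes, and the choice to encode the owning table in the low bits of the callout id are hypothetical. Locking is omitted; in a real per-CPU design each table would carry its own lock, so timeouts issued on different CPUs would not contend.

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS_MAX   64                      /* hypothetical upper bound */
#define TABLE_BITS  6                       /* log2(NCPUS_MAX) */
#define TABLE_MASK  ((1u << TABLE_BITS) - 1u)
#define SLOTS       16                      /* hypothetical table capacity */
#define NO_EXPIRY   INT64_MAX               /* "nothing pending" marker */

typedef struct {
    unsigned char used[SLOTS];              /* 0 = free slot */
    int64_t       expiry[SLOTS];
    uint64_t      id[SLOTS];
    uint64_t      next_seq;
} cpu_table_t;

static cpu_table_t tables[NCPUS_MAX];       /* one table per CPU */

/* Register a callout on the caller's CPU; returns 0 if the table is full. */
static uint64_t sketch_timeout(unsigned cpu, int64_t expiry)
{
    cpu_table_t *t;
    int i;

    assert(cpu < NCPUS_MAX);
    t = &tables[cpu];
    for (i = 0; i < SLOTS; i++) {
        if (!t->used[i]) {
            t->used[i] = 1;
            t->expiry[i] = expiry;
            /* Low bits name the owning table, so cancel needs no search. */
            t->id[i] = (++t->next_seq << TABLE_BITS) | cpu;
            return t->id[i];
        }
    }
    return 0;
}

/* Cancel a callout; touches only the table named in the id. */
static int sketch_untimeout(uint64_t id)
{
    cpu_table_t *t = &tables[id & TABLE_MASK];
    int i;

    for (i = 0; i < SLOTS; i++) {
        if (t->used[i] && t->id[i] == id) {
            t->used[i] = 0;
            return 1;
        }
    }
    return 0;
}

/* Earliest pending expiry: the time a one-shot timer would be armed for. */
static int64_t sketch_next_expiry(unsigned cpu)
{
    cpu_table_t *t;
    int64_t min = NO_EXPIRY;
    int i;

    assert(cpu < NCPUS_MAX);
    t = &tables[cpu];
    for (i = 0; i < SLOTS; i++) {
        if (t->used[i] && t->expiry[i] < min)
            min = t->expiry[i];
    }
    return min;
}
```

In this sketch, sketch_next_expiry() stands in for the event-based part: rather than waking on every tick to poll for expired callouts, a per-CPU timer would be reprogrammed to the returned time whenever the earliest expiry changes.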