OpenSolaris

Bug ID 6619224
Synopsis Tick accounting needs to be made scalable
State 11-Closed:Verified (Closed)
Category:Subcategory kernel:other
Keywords batoka-perf | kt-scalability | opl-jupiter | opl-perf | opl-rn | spbc_s10X | sps-scale
Responsible Engineer Madhavan Venkataraman
Reported Against
Duplicate Of
Introduced In solaris_2.0
Commit to Fix snv_81
Fixed In snv_81
Release Fixed solaris_nevada(snv_81), solaris_10u6(s10u6_03) (Bug ID:2158895)
Related Bugs 2132272, 6510779, 6569264, 6582502, 6633823, 6636045, 6662878, 6748342, 6791966, 6810110, 6811269, 6820594, 6901289
Submit Date 19-October-2007
Last Update Date 12-January-2009
Description
Solaris performs some accounting and bookkeeping activities every
clock tick. To do this, a cyclic timer is created to go off every
clock tick and call a clock handler (clock()). This handler performs,
among other things, tick accounting for active threads.

Every tick, the tick accounting code in clock() walks all the active
CPUs in the system, determines whether a user thread is running on each
CPU and, if so, charges that thread one tick. The charge measures how
many ticks of CPU time the thread has used and counts against the
thread's time quantum, which in turn feeds dispatching decisions.
Finally, the LWP interval timers (virtual and profiling timers) are
processed every tick, if they have been set.
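
The per-tick loop described above can be sketched roughly as follows.
This is a simplified user-space model, not the actual clock() code: the
types sim_thread and sim_cpu, their fields, and sim_tick_account() are
hypothetical names invented for illustration, and the interval timer
processing and all locking are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real Solaris thread/CPU types. */
struct sim_thread {
    int is_user;          /* nonzero for a user thread */
    unsigned long ticks;  /* CPU ticks charged so far */
    int quantum_left;     /* ticks remaining in the time quantum */
    int needs_resched;    /* set once the quantum is used up */
};

struct sim_cpu {
    int active;                  /* nonzero if the CPU is active */
    struct sim_thread *running;  /* thread on this CPU, or NULL if idle */
};

/*
 * One pass of tick accounting, run by a single caller.
 * Visits every CPU, so the cost grows linearly with ncpus.
 * Returns the number of threads that were charged a tick.
 */
size_t sim_tick_account(struct sim_cpu *cpus, size_t ncpus)
{
    size_t charged = 0;

    assert(cpus != NULL || ncpus == 0);

    for (size_t i = 0; i < ncpus; i++) {
        struct sim_thread *t = cpus[i].running;

        /* Skip inactive CPUs, idle CPUs and non-user threads. */
        if (!cpus[i].active || t == NULL || !t->is_user)
            continue;

        /* Charge one tick of CPU time to the running user thread. */
        t->ticks++;

        /* The same tick counts against the thread's time quantum. */
        if (t->quantum_left > 0 && --t->quantum_left == 0)
            t->needs_resched = 1;

        charged++;
    }
    return charged;
}
```

Calling sim_tick_account() once per simulated tick over an array of
sim_cpu entries mirrors the shape of the work: one caller, one visit
per CPU, every tick.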

As the number of CPUs increases, the tick accounting loop gets longer.
Because a single CPU runs the loop, the work is also single-threaded,
so tick accounting does not scale. On a busy system with many CPUs,
the tick accounting loop alone can often take more than a tick to
complete if the locks it needs to acquire are busy. This causes the
invocations of the clock() handler to drift in time. Consequently, the
lbolt drifts, any timing based on the lbolt becomes inaccurate, and any
computations based on the lbolt (such as load averages) are skewed.
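
The effect on the lbolt can be illustrated with a toy model. It assumes,
purely for illustration, that the lbolt advances once per handler
invocation and that an invocation which overruns its tick delays the
next one by the overrun; the real cyclic subsystem is more involved.
The function names and parameters are hypothetical.

```c
#include <assert.h>

/*
 * Toy model: value of a tick counter after elapsed_us microseconds,
 * given a nominal tick length and the time one handler run takes.
 * Assumption: the next run cannot start before the previous one ends.
 */
unsigned long sim_lbolt(unsigned long elapsed_us, unsigned long tick_us,
    unsigned long handler_us)
{
    assert(tick_us > 0);

    /* Effective period is the longer of the tick and the handler run. */
    unsigned long period = handler_us > tick_us ? handler_us : tick_us;

    return elapsed_us / period;
}

/* Ticks by which the modelled counter lags the ideal tick count. */
unsigned long sim_lbolt_lag(unsigned long elapsed_us, unsigned long tick_us,
    unsigned long handler_us)
{
    return elapsed_us / tick_us -
        sim_lbolt(elapsed_us, tick_us, handler_us);
}
```

Under these assumptions, with a 10 ms tick and a handler that takes
12.5 ms per run, sim_lbolt_lag(1000000, 10000, 12500) gives a lag of 20
ticks after one second, so anything derived from the counter would
report 0.8 seconds where 1 second has passed.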

We need to make tick accounting more scalable and reduce its impact on
the clock() handler.
There is work in progress to address 6582502. The idea is to eliminate
ts_tick() altogether and drive all scheduling from timeout(). The
scalability of the timeout() implementation is still an issue, so this
fix may still be applicable in some form.
Work Around
N/A
Comments
N/A