OpenSolaris

Bug ID 6588256
Synopsis HSFS performance needs a boost
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:hsfs
Keywords rtiq_reviewed
Responsible Engineer Moinak Ghosh
Reported Against
Duplicate Of
Introduced In solaris_2.0
Commit to Fix snv_77
Fixed In snv_77
Release Fixed solaris_nevada(snv_77)
Related Bugs 6621609 , 6627899
Submit Date 2-August-2007
Last Update Date 31-December-2007
Description
HSFS in Solaris currently performs poorly: it issues many small
reads and does no read-ahead. There is a lot of scope
to improve performance. The maximum throughput that can be achieved
today, even with the fastest data DVDs and a straight sequential
read, is a meagre 3.4 MB/s.

In addition, the changes discussed below are vital for building
Live bootable CDs and DVDs, as they reduce the boot time of
Solaris when booting from CD/DVD, which makes an OpenSolaris
LiveCD practical. With Solaris moving towards a LiveCD-based
installer, this is needed.

HSFS Performance Enhancements
-----------------------------

The HSFS filesystem module in Solaris performs somewhat slower
than competing implementations such as the one in Linux. CD
performance is especially important not only for multimedia apps
but also for Live bootable CDs, which is the direction being
taken with the Indiana project.

With these requirements in mind a couple of enhancements were
made to the HSFS implementation in OpenSolaris:

   1) Addition of an I/O Scheduler
   2) Read-Ahead

The hsfs implementation in Solaris suffers from quite a bit of
drive head seeking and from numerous read requests being issued
in small 2K chunks. Also no read-ahead is performed. The above
enhancements aim to reduce these problems.

The I/O scheduler implements an Elevator algorithm. The read
requests are sorted by logical block number and
issued to the device in that order. In addition the scheduler
checks whether subsequent requests are adjacent to each
other. If so, it merges those requests into a single
larger request, which is then delivered to the drive. The
algorithm scans I/O requests in ascending order of
logical block number until no higher numbered requests
remain, at which point it starts again from the lowest numbered
read request, if any. This is thus a 1-way forward-merge
Elevator, also called Circular Look. This algorithm is
suitable for CD/DVD media as these media have a single outward
spiralling circular track with poor random-seek behavior.
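
As a rough illustration of the Circular Look selection and forward
merge described above, here is a minimal user-space sketch. It is not
the hsfs code: a sorted array stands in for the AVL tree, and the
names struct req, clook_next and merge_forward are hypothetical.

```c
#include <stddef.h>

/* Hypothetical request: a run of logical blocks [lbn, lbn + nblocks). */
struct req {
    unsigned lbn;
    unsigned nblocks;
};

/*
 * Pick the next request from an array sorted by ascending lbn.
 * Returns the index of the first request above last_lbn, wraps to
 * index 0 when none is higher, and returns -1 for an empty queue.
 */
int
clook_next(const struct req *q, size_t n, unsigned last_lbn)
{
    size_t i;

    if (n == 0)
        return (-1);
    for (i = 0; i < n; i++) {
        if (q[i].lbn > last_lbn)
            return ((int)i);
    }
    return (0);     /* nothing higher: start again from the lowest */
}

/*
 * Starting at index start, count how many consecutive requests are
 * adjacent (each one begins where the previous one ends). The total
 * size of the merged request is stored in *total_blocks.
 */
size_t
merge_forward(const struct req *q, size_t n, size_t start,
    unsigned *total_blocks)
{
    size_t count = 1;
    unsigned total = q[start].nblocks;

    while (start + count < n &&
        q[start + count].lbn ==
        q[start + count - 1].lbn + q[start + count - 1].nblocks) {
        total += q[start + count].nblocks;
        count++;
    }
    *total_blocks = total;
    return (count);
}
```

A caller would invoke clook_next with the block number of the last
serviced request, then merge_forward from the returned index to size
the single larger request.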

In addition the I/O scheduler also checks for deadline
expiration to prevent starvation. A deadline of 500ms
is used for reads.
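
The deadline check can be pictured with the following small sketch.
Only the 500ms value comes from the text; the function names, the
millisecond clock and the index-based interface are assumptions made
for illustration.

```c
#include <stdint.h>

/* 500ms read deadline from the text; the macro name is hypothetical. */
#define READ_DEADLINE_MS 500

/* True if a request enqueued at enqueued_ms has waited past the deadline. */
static int
deadline_exceeded(uint64_t enqueued_ms, uint64_t now_ms)
{
    /* Assumes a monotonic clock, so now_ms >= enqueued_ms. */
    return (now_ms - enqueued_ms > READ_DEADLINE_MS);
}

/*
 * Choose where the scheduler starts: the oldest request if it has
 * exceeded its deadline, otherwise whatever the elevator selected.
 * oldest_idx is -1 when the queue is empty.
 */
int
pick_start(int oldest_idx, uint64_t oldest_enqueued_ms, uint64_t now_ms,
    int elevator_idx)
{
    if (oldest_idx >= 0 &&
        deadline_exceeded(oldest_enqueued_ms, now_ms))
        return (oldest_idx);
    return (elevator_idx);
}
```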

The read-ahead attempts to detect a sequential file
access pattern and "warms" the page cache by preloading pages
of data before the application requests them. This is
beneficial as most scenarios of CD/DVD usage represent typical
single-threaded sequential access behavior. The sequential
pattern detection is not perfect, however, and can be ineffective
in the face of multiple threads reading the same file. But as
mentioned already this is not a big issue for general CD/DVD
usage. Also, so as not to swamp the page cache, the UFS
freebehind logic is being used here, see:
http://monaco.sfbay/detail.jsf?cr=6207772

Sequential access is assumed if there are more than 2 read
requests in ascending order and adjacent to each other. The
read-ahead logic then slowly ramps up faulting in additional
pages. The number of pages preloaded is increased with every
subsequent sequential read up to a maximum of 4. In addition
read-ahead is performed only if a successful cache-hit of an
earlier faulted page occurs. This behavior tends to
throttle read-ahead in case of cache misses, since those
indicate memory pressure and caching inefficiency, in which
case preloading pages would waste bandwidth. Also the
read-ahead count is decremented with every non-sequential access.
As mentioned above read-ahead is only issued on a cache-hit,
but with the additional condition that the subsequent page
is not already in the cache. So in the ideal case this behavior
will mean that the application never waits for disk I/O.
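
The ramp-up rules above can be sketched roughly as follows. This is a
simplified model, not the hsfs implementation: it counts pages rather
than bytes, the struct and function names are hypothetical, and
resetting the contiguous count on a non-sequential access is an
assumption the text does not state.

```c
#include <stdint.h>

#define RA_MAX_PAGES    4   /* maximum preload, from the text */
#define SEQ_THRESHOLD   2   /* "more than 2" adjacent reads */

/* Hypothetical per-file read-ahead state. */
struct ra_state {
    uint64_t prev_end;      /* end offset of the previous read */
    int num_contig;         /* count of contiguous reads so far */
    int ra_pages;           /* pages to preload */
};

/* Update the state for a read of len bytes at offset off. */
void
ra_update(struct ra_state *s, uint64_t off, uint64_t len)
{
    if (off == s->prev_end) {
        /* Adjacent to the previous read: ramp up slowly. */
        s->num_contig++;
        if (s->num_contig > SEQ_THRESHOLD &&
            s->ra_pages < RA_MAX_PAGES)
            s->ra_pages++;
    } else {
        /* Non-sequential access: back off (reset is assumed). */
        s->num_contig = 0;
        if (s->ra_pages > 0)
            s->ra_pages--;
    }
    s->prev_end = off + len;
}

/* Read-ahead is issued only on a cache hit with the next page absent. */
int
ra_should_issue(const struct ra_state *s, int cache_hit,
    int next_page_cached)
{
    return (cache_hit && !next_page_cached && s->ra_pages > 0);
}
```

With this model the third adjacent read is the first one to enable
read-ahead, and a long sequential run settles at 4 pages.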

Another side benefit of doing read-ahead in sequential access
is that it provides enough meat to the I/O scheduler so that
it can optimize and coalesce the subsequent reads, achieving
higher throughput. An application reading one or two pages at
a time does not exercise the I/O scheduler much.

Code Overview
-------------

usr/src/uts/common/fs/hsfs/hsfs_vfsops.c:

The modifications in this file deal with initializing the
required data structures during mount. A global variable is
checked to determine whether to enable these features or
not. This is done from a debugging perspective:

int do_schedio = 1;

The variable can be toggled via mdb prior to mounting to
disable these features.

static int
hs_mountfs(
...
...
       if (do_schedio) {
               fsp->hqueue = kmem_alloc(sizeof(struct hsfs_queue), KM_SLEEP);
               hsched_init(fsp, fsid, &modlinkage);
       }
...
}

Cleanup is done via a call to hsched_fini in hsfs_unmount.


usr/src/uts/common/sys/fs/hsfs_node.h

The read-ahead and I/O scheduler data structures are defined
in this file. Read-ahead computation adds 3 new variables to
the hsnode structure:

u_offset_t      hs_prev_offset; /* Last read end offset (readahead) */
int             hs_num_contig;  /* Count of contiguous reads */
int             hs_ra_bytes;    /* Bytes to readahead */

The other data structures are:

struct hio - A structure that holds information for a read
             request that is enqueued for processing by the
             scheduling function. An AVL tree is used to
             access the read requests in a sorted manner.

struct hio_info - A structure that holds information about
	     all the read requests issued during a read-ahead
             invocation. This is then enqueued on a task-
             queue for processing by a thread that takes
             this read-ahead to completion and cleans up.

struct hsfs_queue - This is per-filesystem structure that
             stores toplevel data structures for the I/O
             scheduler.

The hsfs filesystem structure is obviously modified to contain
a pointer to a struct hsfs_queue.

usr/src/uts/common/fs/hsfs/hsfs_node.c

Very simple changes to initialize the read-ahead counters
when initializing a hsnode.


usr/src/uts/common/fs/hsfs/hsfs_vnops.c

This file contains 90% of the changes. Most of it is new code
addition with changes to the hsfs_read, hsfs_getpage and
hsfs_getapage routines.

The changes to hsfs_read deal with doing the freebehind
correctly if read-ahead is in effect. This is not different
from the same implementation in UFS.

The changes to hsfs_getpage deal with updating the read-ahead
counters based on the vnode, offset and length of data being
read and the end offset of the previous read on the
same file. The code is fairly well commented.

The changes to hsfs_getapage deal with creating the struct
hio requests and enqueuing them for processing by the I/O
scheduler. It also checks for read-ahead and invokes the
read-ahead routine if several conditions are met. These
conditions were mentioned towards the beginning of this
document (5th para).

It is pertinent to note here that the I/O scheduling function
hsched_invoke_strategy does not run in a separate thread.
Instead the caller, hsfs_getapage, is expected to repeatedly
invoke this function until its I/O requests have been
satisfied. In practice this has been found to give low-overhead
high performance. Since hsched_invoke_strategy acquires
the strategy_lock on entry to ensure single-threaded operation,
all except one thread will be sleeping on this mutex rather
than busy-waiting, as the text above might seem to imply.

All of the new code in hsfs_getapage is commented and not
too difficult to follow. At all points a check is made to
see whether these features are to be used - for debuggability.

This code calls the read-ahead function if we have a cache
hit, we are doing sequential read and the next page is not
in the cache:

if (fsp->hqueue != NULL &&
         hp->hs_prev_offset - off == pgsize &&
         hp->hs_prev_offset < filsiz &&
         hp->hs_ra_bytes > 0 &&
         !page_exists(vp,hp->hs_prev_offset)) {
         hsfs_getpage_ra(vp, hp->hs_prev_offset, seg,
                 addr + pgsize, hp, fsp, xarsiz, bof,
                 chunk_lbn_count, chunk_data_bytes);
}

hsfs_getpage_ra is essentially a simplified version of
hsfs_getapage. It does most of the same processing but
puts the read requests on a queue for processing via a
background kernel thread:

bufsused = count;
info = kmem_alloc(sizeof (struct hio_info), KM_SLEEP);
info->bufs = bufs;
info->vas = vas;
info->sema = fio_done;
info->bufsused = bufsused;
info->bufcnt = bufcnt;
info->hqueue = fsp->hqueue;
info->pp = pp;

(void) taskq_dispatch(fsp->hqueue->ra_task,
			hsfs_ra_task, info, KM_SLEEP);


hsfs_ra_task runs when the ra_task queue has been fed
some requests. It in turn invokes the scheduling function
until its requests are serviced. It then does a cleanup
and releases the I/O lock on the pages. The ra_task
queue is a dynamic task Q since it is performance sensitive.
To be fully effective the read-ahead should complete before,
or just before, the application comes back with the request
for that page. For a purely sequential single-threaded read
from DVD a drop in throughput was observed in practice when
using a non-dynamic task Q.

hsched_invoke_strategy contains the real meat of the I/O
scheduler. First it grabs its own lock and then it
grabs the lock that protects the AVL trees. It then checks
the deadline tree to see whether the oldest request has
exceeded the deadline. If so, that request is used
as the starting point.

Otherwise it looks at the read tree sorted ascending by
LBN and fetches the request with the next higher block
number from the read request that was processed earlier.
If there are no such requests in the queue then it fetches
the one with the lowest logical block number. This is what
gives the Circular Look behavior. This is the code segment
responsible for that:

fio = avl_find(&hqueue->read_tree, hqueue->next, &pos);
if (fio != NULL)
	fio = AVL_NEXT(&hqueue->read_tree, fio);
else
	fio = avl_nearest(&hqueue->read_tree, pos, AVL_AFTER);

if (fio == NULL) {
	fio = avl_first(&hqueue->read_tree);
}

Here hqueue->next is a dummy struct hio that holds the
logical block number of the last processed read request.
avl_find will either return a node having the given
value or if it does not exist will return NULL and pos
will point to the insertion point. Both cases are handled.

Subsequently the code does a forward (ascending block number)
coalescing of buffers that are adjacent to each other. The
AVL tree is traversed in order via AVL_NEXT and all the
adjacent buffers are put into a linked list.

Next if adjacent buffers were detected then a new buf
structure is synthesized. This is somewhat different from
a normal buf that one would get via getrbuf or bioclone.
In particular the buf points to a kmem_alloc-ed chunk.
Also the buf structure itself is allocated once during
mount and re-used every time through the scheduling
function as it is single-threaded.

Finally this buf is dispatched and the function waits for
the I/O to complete. Once the data has been received
successfully, the blocks are copied back into the original bufs
that were sent down from the caller and biodone is
signaled for each.

Error is handled by looking at the b_resid buf member.
b_resid will indicate how much data was not processed
for the I/O. So we can find out which of the caller's
original bufs are good to go and which failed and signal
appropriately.
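
To make the b_resid bookkeeping concrete, here is a minimal sketch of
how the number of fully satisfied original bufs could be derived from
the residual count of the merged request. It assumes the device
transfers data from the start of the merged request, so that the
residual is an untransferred tail; the function name and the
array-of-sizes interface are hypothetical.

```c
#include <stddef.h>

/*
 * sizes[] holds the byte counts of the caller's original bufs, in
 * the order they were merged. resid is the number of bytes of the
 * merged request that were not transferred. Returns how many of the
 * original bufs were completely filled; the rest must be failed.
 */
size_t
bufs_completed(const size_t *sizes, size_t n, size_t resid)
{
    size_t total = 0, done, ok = 0, i;

    for (i = 0; i < n; i++)
        total += sizes[i];
    if (resid >= total)
        return (0);             /* nothing usable was transferred */

    done = total - resid;       /* bytes actually transferred */
    for (i = 0; i < n; i++) {
        if (done < sizes[i])
            break;              /* this buf is only partly filled */
        done -= sizes[i];
        ok++;
    }
    return (ok);
}
```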


Initialization: hsched_init performs the initialization
and is called during mount. It sets up the mutexes, the
AVL trees and the read-ahead taskQ. The maximum I/O
transfer size supported by the device is probed using
ldi_ioctl; a conservative value of 16K is assumed by
default in case the ldi_ioctl is not successful.
The read-ahead size is also set here. Ordinarily we'd
read-ahead 4 pages worth of data, but it is reduced to
1 page in case we are using large pages.

The function hsched_fini does all the cleanup.
hsfs_deadline_compare and hsfs_offset_compare are the
comparison functions used for the avl trees. These look
and behave suspiciously similar to similar functions in:
usr/src/uts/common/fs/zfs/vdev_queue.c

I spent some time testing this quite a bit and used filebench as well. Interestingly filebench (and probably even iozone) is written with writable filesystems in mind, so I ran into problems using it on hsfs, which is read-only, and had to make changes to filebench to get it to work. For example, it opens files with O_RDWR even for the read tests. The workload scripts also needed a change to use predefined files instead of creating new ones.

I used the random multi-thread read, single-stream sequential read and multi-stream sequential read tests with various chunk sizes.

Using filebench gave me some more insights and helped improve the code/performance. Here are the changes I made since last time:

- The logic used to implement the 1-way Elevator (Circular Look) was too expensive.
  It involved two traversals of the AVL tree while holding a lock. Here's what I
  was doing earlier:

            fio = avl_find(&hqueue->read_tree, hqueue->next, &pos);
            if (fio != NULL)
                    fio = AVL_NEXT(&hqueue->read_tree, fio);
            else
                    fio = avl_nearest(&hqueue->read_tree, pos,
                                                    AVL_AFTER);

  The combination of avl_find and avl_nearest was too expensive. After a bunch of
  meddling I hit upon a way to use the last processed I/O node of the current
  invocation as a sentinel for the next invocation.
  That way the code boils down to just a simple AVL_NEXT:

                  fio = AVL_NEXT(&hqueue->read_tree, hqueue->next);
                  avl_remove(&hqueue->read_tree, hqueue->next);

   This made a difference.

- filebench showed a degradation in performance for small 2k-4k random I/O by
  multiple threads whereas bigger chunks of I/O showed a big benefit.

  That essentially boils down to the non-coalescing case: the last else case in
  hsched_invoke_strategy. It simply issued a bdev_strategy and biowait and then
  released the I/O lock. That was not enough to keep the I/O pipe and the device
  sufficiently busy. So I changed it to release the lock before calling biowait, as
  at that point there is no shared data to worry about. This change actually
  improved the small random read performance compared to the vanilla hsfs and the
  benefits of re-ordering were visible.

- The normal hsfs module always issues reads in 2K chunks. This was a result of the
  need to support file data interleaving on older hardware. However from what I see
  interleaving is hardly used today. HSFS computes the interleaving chunk size and
  sets it to the HSFS logical block size of 2K when there is no interleaving. This
  is wasteful and results in 2K reads even when the file data is contiguous and we
  can read whole pages at a time.

  Thus I made a small tweak to set the chunk size to the page size if the
  page size is a multiple of the logical block size. This resulted in much less
  processing overhead and the I/O scheduler is better able to coalesce:

    if (hp->hs_dirent.intlf_sz == 0) {
            chunk_data_bytes = LBN_TO_BYTE(1, vp->v_vfsp);
            /*
             * Optimization: If our pagesize is a multiple of LBN
             * bytes, we can avoid breaking up a page into individual
             * lbn-sized requests.
             */
            if (pgsize % chunk_data_bytes == 0) {
                    chunk_lbn_count = BYTE_TO_LBN(pgsize, vp->v_vfsp);
                    chunk_data_bytes = pgsize;
            }
   ...
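
The sentinel idea from the first change in the list above (keeping the
last processed node linked so the next invocation needs only a single
"next" step) can be illustrated in user space. The sketch below uses a
sorted doubly linked list in place of the AVL tree; all names are
hypothetical and it is only a model of the approach, not the hsfs code.

```c
#include <stddef.h>

struct node {
    unsigned lbn;
    struct node *prev, *next;
};

struct queue {
    struct node *head;      /* lowest lbn */
    struct node *sentinel;  /* last processed node, still linked */
};

static void
unlink_node(struct queue *q, struct node *n)
{
    if (n->prev != NULL)
        n->prev->next = n->next;
    else
        q->head = n->next;
    if (n->next != NULL)
        n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Insert n so the list stays sorted by ascending lbn. */
void
queue_insert(struct queue *q, struct node *n)
{
    struct node *cur = q->head, *prev = NULL;

    while (cur != NULL && cur->lbn < n->lbn) {
        prev = cur;
        cur = cur->next;
    }
    n->prev = prev;
    n->next = cur;
    if (prev != NULL)
        prev->next = n;
    else
        q->head = n;
    if (cur != NULL)
        cur->prev = n;
}

/*
 * Return the next request in Circular Look order. The returned node
 * stays linked as the sentinel until the following call, so the
 * caller must not free it before then.
 */
struct node *
queue_next(struct queue *q)
{
    struct node *n;

    if (q->sentinel != NULL) {
        n = q->sentinel->next;          /* one step, no search */
        unlink_node(q, q->sentinel);    /* retire the old sentinel */
        if (n == NULL)
            n = q->head;                /* wrap to the lowest lbn */
    } else {
        n = q->head;
    }
    q->sentinel = n;
    return (n);
}
```

Requests inserted below the sentinel's block number are picked up only
after the scan wraps around, which is the 1-way behavior described
earlier.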
##################################################
# Test Results
##################################################
A bunch of testing was performed on a Thinkpad T60p laptop and the results are posted below. Testing on SPARC is discussed on another note. Filebench was used to collect metrics on a bunch of testcases. Filebench had to be modified slightly to make it work with a read-only filesystem.

Since these are performance tests, the baseline metrics and the metrics from the enhanced module are compared below. A test DVD with three 1GB files was used. A SXDE B70 DVD was also used in the tests.

It will be clear from the results below that there is a general improvement in performance, sometimes up to a 50% reduction in the time taken. In addition the reduction in system time is also quite large. The biggest reason for this is indicated by the DTrace outputs at the end. The current hsfs module always issues reads of 2K size, which results in a huge number of requests going down to the device. The enhanced module on the other hand never issues 2K requests. Its requests are 4K or larger, resulting in a decrease in the number of physical reads and a reduction in system load.
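
The effect of request size on request count is simple arithmetic, sketched below with a hypothetical helper. For the dd test of 131072 records of 8192 bytes (1GB), 2K requests give 524288 physical reads, matching the baseline DTrace count, while 16K requests would give 65536, close to what the enhanced module shows.

```c
#include <stdint.h>

/*
 * Number of physical requests needed to read total_bytes when every
 * request carries request_bytes (rounded up for a partial tail).
 */
uint64_t
requests_needed(uint64_t total_bytes, uint64_t request_bytes)
{
    return ((total_bytes + request_bytes - 1) / request_bytes);
}
```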

The iostat outputs at the end clearly show increased throughput. Since this was a laptop, the DVD drive takes more time to reach max throughput. On a desktop in earlier use the throughput was observed to come close to the device maximum when copying large files.

------------------
Straight file copy
------------------
## BASELINE ##
bash-3.00# time cp /mnt/cdrom/largefile1 /space/

real    6m3.944s
user    0m0.002s
sys     0m20.527s

## ENHANCED ##
bash-3.00# time cp /mnt/cdrom1/largefile1 /space/

real    3m6.583s
user    0m0.001s
sys     0m15.917s

----------------
dd of a 1GB File
----------------
## BASELINE ##
bash-3.00# time dd if=/media/TestDVD/largefile1 of=/dev/null bs=8192 count=131072
131072+0 records in
131072+0 records out

real    6m4.996s
user    0m0.228s
sys     0m12.340s

## ENHANCED ##
bash-3.00# time dd if=/media/TestDVD/largefile1 of=/dev/null bs=8192 count=131072
131072+0 records in
131072+0 records out

real    3m10.065s
user    0m0.170s
sys     0m3.033s

----------------------
cpio of a SXDE B70 DVD
----------------------
## BASELINE ##
bash-3.00# time find . | cpio -pdum /space/work/dvd
6944256 blocks

real    36m15.867s
user    0m1.885s
sys     2m46.268s

## ENHANCED ##
bash-3.00# time find . | cpio -pdum /space/work/dvd
6944256 blocks

real    28m22.953s
user    0m1.467s
sys     0m37.423s

---------------------
Tar of a SXDE B70 DVD
---------------------
## BASELINE ##
bash-3.00# time tar cpf - . | cat > /dev/null
  
real    86m56.636s
user    0m2.822s
sys     2m12.932s

## ENHANCED ##
bash-3.00# time tar cpf - . | cat > /dev/null

real    79m31.187s
user    0m2.438s
sys     0m30.000s

--------------------------------------------------------------------------------
Tar of a lofi mounted SXDE ISO image, the ISO image residing on a UFS filesystem
--------------------------------------------------------------------------------
## BASELINE ##
bash-3.00# time tar cpf - . | cat > /dev/null

real    3m7.076s
user    0m3.108s
sys     1m33.591s

## ENHANCED ##
bash-3.00# time tar cpf - . | cat > /dev/null

real    2m26.876s
user    0m2.728s
sys     0m27.414s

--------------------------------------------------------------------------------
Tar of a lofi mounted SXDE ISO image, the ISO image residing on a ZFS filesystem
--------------------------------------------------------------------------------
## BASELINE ##
bash-3.00# time tar cpf - . | cat > /dev/null

real    2m38.084s
user    0m3.162s
sys     1m34.831s

## ENHANCED ##
bash-3.00# time tar cpf - . | cat > /dev/null

real    1m35.409s
user    0m2.616s
sys     0m26.476s

------------------------------------------------------------------------
Number of physical I/O requests issued by hsfs from the dd of a 1GB file
------------------------------------------------------------------------
## BASELINE ##
bash-3.00# dtrace -n 'fbt::bdev_strategy:entry { @[execname] = count(); }'
dtrace: description 'fbt::bdev_strategy:entry ' matched 1 probe
^C

  fsflush                                                           4
  firefox-bin                                                       5
  sched                                                            59
  dd                                                           524286

## ENHANCED ##
bash-3.00# dtrace -n 'fbt::bdev_strategy:entry { @[execname] = count(); }'
dtrace: description 'fbt::bdev_strategy:entry ' matched 1 probe
^C

  dd                                                               13
  fsflush                                                          19
  sched                                                         65694

------------------------------------------------------
Block sizes issued from hsfs due to a dd of a 1GB file
------------------------------------------------------
## BASELINE ##
bash-3.00# dtrace -n 'io:::start { @[execname, args[2]->fi_pathname] = lquantize(args[0]->b_bcount, 0, 32767, 2048); }'
dtrace: description 'io:::start ' matched 3 probes
^C
...
...

  dd                                                  /media/TestDVD/largefile3                         
           value  ------------- Distribution ------------- count    
               0 |                                         0        
            2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 524288   
            4096 |                                         0        

## ENHANCED ##
>> The Blocks accounted to sched below are actually the async read-ahead requests from hsfs
>>
bash-3.00# dtrace -n 'io:::start { @[execname, args[2]->fi_pathname] = lquantize(args[0]->b_bcount, 0, 65536, 2048); }'
dtrace: description 'io:::start ' matched 3 probes
^C
...
...

  dd                                                  /media/TestDVD/largefile3                         
           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |@@@@@@@                                  2        
            6144 |                                         0        
            8192 |                                         0        
           10240 |                                         0        
           12288 |@@@@                                     1        
           14336 |                                         0        
           16384 |                                         0        
           18432 |                                         0        
           20480 |@@@@                                     1        
           22528 |                                         0        
           24576 |@@@@@@@@@@@@@@@@@@@@@@@@@                7        
           26624 |                                         0        

  sched                                               /media/TestDVD/largefile3                         
           value  ------------- Distribution ------------- count    
           10240 |                                         0        
           12288 |                                         1        
           14336 |                                         0        
           16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 65522    
           18432 |                                         0        

-----------------------------------------
Iostat output snippet from dd of 1GB file
-----------------------------------------
## BASELINE ##
                    extended device statistics              
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t0d0
 1515.6    0.0 3031.2    0.0  0.4  1.0    0.3    0.6  43  96 c0t0d0
     cpu
 us sy wt id
  1  7  0 91
                    extended device statistics              
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.2    0.0    0.2  0.0  0.0    0.0    1.5   0   0 c1t0d0
 1515.4    0.0 3030.8    0.0  0.4  1.0    0.3    0.6  43  96 c0t0d0
     cpu
 us sy wt id
  1  7  0 92
                    extended device statistics              
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t0d0
 1506.8    0.0 3013.6    0.0  0.4  1.0    0.3    0.6  43  96 c0t0d0


## ENHANCED ##
                    extended device statistics              
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t0d0
  467.4    0.0 5608.9    0.0  0.0  0.9    0.0    2.0   0  94 c0t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c2t0d0
...
...
     cpu
 us sy wt id
  2  6  0 92
                    extended device statistics              
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t0d0
  548.4    0.0 6581.2    0.0  0.0  0.9    0.0    1.7   0  93 c0t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c2t0d0
     cpu
 us sy wt id
  2  6  0 92
                    extended device statistics              
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t0d0
  553.0    0.0 6636.0    0.0  0.0  0.9    0.0    1.7   0  93 c0t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c2t0d0
##################################################
# Filebench test results on x86
##################################################

12 Filebench testcases were used:

- Random read with 2K block size, 16 threads
- Random read with 8K block size, 16 threads
- Random read with 32K block size, 16 threads
- Random read with 64K block size, 16 threads
- Random read with 1M block size, 16 threads
- Single stream read with 8K blocksize
- Single stream read with 32K blocksize
- Single stream read with 1M blocksize
- Multi stream read with 8K blocksize
- Multi stream read with 32K blocksize
- Multi stream read with 1M blocksize
- Mixed read 4 threads doing a mixture of random and sequential I/O on same/different files

The Test DVD used for this had three 1GB files. The results are reproduced below. It is clear that there is an across-the-board benefit, though the benefits for random read are marginal. The single stream reads show a massive benefit due to I/O coalescing and read-ahead caching, which give the application the illusion of higher bandwidth than the device can support.

The Filebench and previous tar tests indicate possibilities for additional improvement, such as better caching of metadata to reduce seeks and doing read-ahead even for random reads with large chunk sizes. However those are possibilities for another RFE.

---------------------------
 randomread2k 
---------------------------
## BASELINE ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                 15ops/s   0.0mb/s   2149.3ms/op       29us/op-cpu

IO Summary:          4474 ops     14.8 ops/s,       15/0 r/w     0.0mb/s,      624uscpu/op
 
## ENHANCED ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                 15ops/s   0.0mb/s   2156.4ms/op       26us/op-cpu

IO Summary:          4494 ops     14.9 ops/s,       15/0 r/w     0.0mb/s,      479uscpu/op
 
---------------------------
 randomread8k 
---------------------------
## BASELINE ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                 15ops/s   0.1mb/s   2191.6ms/op       45us/op-cpu

IO Summary:          4383 ops     14.5 ops/s,       15/0 r/w     0.1mb/s,      744uscpu/op
 
## ENHANCED ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                 15ops/s   0.1mb/s   2150.8ms/op       42us/op-cpu

IO Summary:          4508 ops     14.9 ops/s,       15/0 r/w     0.1mb/s,     1278uscpu/op
 
---------------------------
 randomread32k 
---------------------------
## BASELINE ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  3ops/s   0.1mb/s   9669.6ms/op      163us/op-cpu

IO Summary:           984 ops      3.3 ops/s,        3/0 r/w     0.1mb/s,     3096uscpu/op
 
## ENHANCED ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  3ops/s   0.1mb/s   9609.6ms/op      145us/op-cpu

IO Summary:           997 ops      3.3 ops/s,        3/0 r/w     0.1mb/s,     2344uscpu/op
 
---------------------------
 randomread64k 
---------------------------
## BASELINE ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  2ops/s   0.1mb/s  19598.3ms/op      310us/op-cpu

IO Summary:           484 ops      1.6 ops/s,        2/0 r/w     0.1mb/s,     6535uscpu/op
 
## ENHANCED ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  2ops/s   0.1mb/s  19641.2ms/op      288us/op-cpu

IO Summary:           481 ops      1.6 ops/s,        2/0 r/w     0.1mb/s,     6536uscpu/op
 
---------------------------
 randomread1m 
---------------------------
## BASELINE ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.1mb/s 326255.2ms/op     5036us/op-cpu

IO Summary:            32 ops      0.1 ops/s,        0/0 r/w     0.1mb/s,   184047uscpu/op
 
## ENHANCED ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.1mb/s 300453.0ms/op     4166us/op-cpu

IO Summary:            34 ops      0.1 ops/s,        0/0 r/w     0.1mb/s,   165575uscpu/op
 
---------------------------
 singlestreamread8k 
---------------------------
## BASELINE ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread                   366ops/s   2.9mb/s      2.7ms/op       98us/op-cpu

IO Summary:        110724 ops    366.2 ops/s,      366/0 r/w     2.9mb/s,      363uscpu/op
 
## ENHANCED ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread                 19754ops/s 154.3mb/s      0.0ms/op       17us/op-cpu

IO Summary:       5969765 ops  19754.4 ops/s,    19754/0 r/w   154.3mb/s,       18uscpu/op
 
---------------------------
 singlestreamread32k 
---------------------------
## BASELINE ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread                    92ops/s   2.9mb/s     10.9ms/op      370us/op-cpu

IO Summary:         27754 ops     91.8 ops/s,       92/0 r/w     2.9mb/s,     1433uscpu/op
 
## ENHANCED ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread                  7185ops/s 224.5mb/s      0.1ms/op       52us/op-cpu

IO Summary:       2172130 ops   7184.6 ops/s,     7185/0 r/w   224.5mb/s,       57uscpu/op
 
---------------------------
 singlestreamread1m 
---------------------------
## BASELINE ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread                     3ops/s   2.9mb/s    347.9ms/op    11671us/op-cpu

IO Summary:           867 ops      2.9 ops/s,        3/0 r/w     2.9mb/s,    45757uscpu/op
 
## ENHANCED ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread                   268ops/s 268.1mb/s      3.7ms/op     1484us/op-cpu

IO Summary:         81049 ops    268.1 ops/s,      268/0 r/w   268.1mb/s,     1633uscpu/op
 
---------------------------
 multistreamread8k 
---------------------------
## BASELINE ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread3                    2ops/s   0.0mb/s    452.8ms/op       41us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread2                    2ops/s   0.0mb/s    452.6ms/op       41us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread1                    2ops/s   0.0mb/s    452.4ms/op       42us/op-cpu

IO Summary:          2001 ops      6.6 ops/s,        7/0 r/w     0.0mb/s,     1338uscpu/op
 
## ENHANCED ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread3                    4ops/s   0.0mb/s    261.0ms/op       37us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread2                    5ops/s   0.0mb/s    204.0ms/op       31us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread1                    5ops/s   0.0mb/s    203.5ms/op       31us/op-cpu

IO Summary:          4071 ops     13.5 ops/s,       13/0 r/w     0.1mb/s,      877uscpu/op
 
---------------------------
 multistreamread32k 
---------------------------
## BASELINE ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread3                    1ops/s   0.0mb/s   1803.5ms/op      151us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread2                    1ops/s   0.0mb/s   1805.5ms/op      135us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread1                    1ops/s   0.0mb/s   1804.6ms/op      147us/op-cpu

IO Summary:           501 ops      1.7 ops/s,        2/0 r/w     0.0mb/s,     5440uscpu/op
 
## ENHANCED ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread3                    1ops/s   0.0mb/s    747.1ms/op      100us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread2                    0ops/s   0.0mb/s   5424.4ms/op      754us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread1                    1ops/s   0.0mb/s    744.7ms/op      107us/op-cpu

IO Summary:           872 ops      2.9 ops/s,        3/0 r/w     0.1mb/s,     2596uscpu/op
 
---------------------------
 multistreamread1m 
---------------------------
## BASELINE ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread3                    0ops/s   0.0mb/s  57412.9ms/op     4432us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread2                    0ops/s   0.0mb/s  57478.9ms/op     4466us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread1                    0ops/s   0.0mb/s  57449.6ms/op     4272us/op-cpu

IO Summary:            15 ops      0.0 ops/s,        0/0 r/w     0.0mb/s,   151319uscpu/op
 
## ENHANCED ##
Flowop totals:
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread3                    0ops/s   0.0mb/s  71125.5ms/op     9738us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread2                    0ops/s   0.0mb/s  21493.7ms/op     3104us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread1                    0ops/s   0.0mb/s  21687.5ms/op     3051us/op-cpu

IO Summary:            30 ops      0.1 ops/s,        0/0 r/w     0.1mb/s,   183635uscpu/op
 
---------------------------
 mixedread 
---------------------------
## BASELINE ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read2                  0ops/s   0.0mb/s  77590.5ms/op     4326us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread1                    0ops/s   0.0mb/s  77433.9ms/op     4409us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread                     2ops/s   0.0mb/s    609.1ms/op       43us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  4ops/s   0.0mb/s    554.5ms/op       34us/op-cpu

IO Summary:          1592 ops      5.3 ops/s,        5/0 r/w     0.0mb/s,     1606uscpu/op
 
## ENHANCED ##
Flowop totals:
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read2                  0ops/s   0.0mb/s  94251.1ms/op     9330us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread1                    0ops/s   0.0mb/s  87424.1ms/op     7679us/op-cpu
limit                       0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
seqread                     4ops/s   0.0mb/s    224.2ms/op       32us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-rate                   0ops/s   0.0mb/s      0.0ms/op        0us/op-cpu
rand-read1                  3ops/s   0.0mb/s    616.5ms/op       60us/op-cpu

IO Summary:          2351 ops      7.8 ops/s,        8/0 r/w     0.1mb/s,     1669uscpu/op
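
The baseline and enhanced runs above are easiest to compare as a ratio of their "IO Summary" lines. The following is a minimal sketch of that arithmetic, not part of Filebench or of the fix. The helper names parse_summary and speedup are hypothetical, and the regular expression assumes the summary line layout shown in these results. It compares ops/s rather than mb/s because the mb/s column rounds to 0.0 or 0.1 in the random and multistream tests.

```python
import re

# Assumed layout, taken from the "IO Summary" lines in the results above:
# IO Summary: <ops> ops <ops/s> ops/s, <r>/<w> r/w <mb/s>mb/s, <us>uscpu/op
SUMMARY_RE = re.compile(
    r"IO Summary:\s+(\d+) ops\s+([\d.]+) ops/s,"
    r"\s+(\d+)/(\d+) r/w\s+([\d.]+)mb/s,\s+(\d+)uscpu/op"
)


def parse_summary(line):
    """Extract the numeric fields from one 'IO Summary' line (hypothetical helper)."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("not an IO Summary line: %r" % line)
    return {
        "ops": int(m.group(1)),
        "ops_per_s": float(m.group(2)),
        "mb_per_s": float(m.group(5)),
        "uscpu_per_op": int(m.group(6)),
    }


def speedup(baseline_line, enhanced_line):
    """Return enhanced ops/s divided by baseline ops/s (hypothetical helper)."""
    base = parse_summary(baseline_line)
    enh = parse_summary(enhanced_line)
    if base["ops_per_s"] == 0:
        raise ValueError("baseline ops/s is zero; ratio is undefined")
    return enh["ops_per_s"] / base["ops_per_s"]


# Summary lines copied from the singlestreamread8k results above.
BASE_8K = ("IO Summary:        110724 ops    366.2 ops/s,"
           "      366/0 r/w     2.9mb/s,      363uscpu/op")
ENH_8K = ("IO Summary:       5969765 ops  19754.4 ops/s,"
          "    19754/0 r/w   154.3mb/s,       18uscpu/op")

if __name__ == "__main__":
    print("singlestreamread8k speedup: %.1fx" % speedup(BASE_8K, ENH_8K))
```

For the singlestreamread8k pair this gives a ratio of roughly 54, which matches the jump from 366.2 to 19754.4 ops/s. The same call works for any other baseline/enhanced pair above, except where the baseline summary reports 0.0 ops/s.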

##################################################
# Filebench test results on SPARC
##################################################

The same Filebench tests were run on a T2000 system and produced similar results. The random access tests actually delivered better numbers than the baseline on the T2000.

These tests were also repeated with kmem_flags = 0x1f.

Please see the comments for detailed SPARC results. This bug description field is already very large and is hitting the limit on how much text it can contain.
Work Around
N/A
Comments
N/A