OpenSolaris

Bug ID 6721168
Synopsis slog latency impacted by I/O scheduler during spa_sync
State 3-Accepted (Yes, that is a problem)
Category:Subcategory kernel:zfs
Keywords AR1.2 | rel_note | zfs-perf
Reported Against
Duplicate Of
Introduced In
Commit to Fix
Fixed In
Release Fixed
Related Bugs 6470056 , 6586537 , 6687412 , 6699227 , 6806882 , 6826241
Submit Date 1-July-2008
Last Update Date 11-November-2009
Description
Observed on a pool with an SSD-based separate intent log (slog), under a workload
that runs N clients, each doing 256 concurrent synchronous ~8K writes over NFS.
All clients use the same filesystem from a single pool.

Normally, I see peaks of 20K+ writes per second (with 2 slog devices) on the server and ~2K zil_commit_writer calls per second. However, during an spa_sync() we get far fewer zil_commit_writer calls, and the latency of those calls goes up considerably. This in turn causes more zfs_write calls to aggregate per zil_commit_writer, which is normal.
The problem is a disproportionately large amount of time in zio_wait(): we handle 3X more work per zil_commit_writer, but we wait about 30X longer in zio_wait().

The attached script latoff3.d reports the average latency of zil_commit_writer and
the average number of writes per zil_commit_writer. It also reports the average time spent
in zio_wait. The more writes per commit, the more wait is expected (time to drain memory to the slog devices), but it should be more or less proportional. The numbers are separated
into 2 categories: during the sync phase or outside of it.
Note: We needed to set zfs_no_write_throttle to work around 6687412.
The script needs to run for >60 seconds to capture multiple sync phases.
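
The bucketing described above can be modeled offline. This is only a sketch of the idea, not latoff3.d itself (which is attached to the bug and not reproduced here). It assumes each commit has already been recorded as a (start time, write count, zio_wait ns, commit ns) tuple and that the spa_sync() intervals are known. The function names, the tuple layout and the sample values are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

def in_sync(ts, sync_intervals):
    """True if ts falls inside a [start, end) spa_sync interval.

    sync_intervals is assumed sorted by start and non-overlapping.
    """
    starts = [start for start, _ in sync_intervals]
    i = bisect_right(starts, ts) - 1
    return i >= 0 and ts < sync_intervals[i][1]

def summarize(commits, sync_intervals):
    """Average write count, zio_wait ns and commit ns per phase."""
    sums = defaultdict(lambda: [0, 0, 0, 0])  # count, writes, wait, commit
    for start_ts, writes, wait_ns, commit_ns in commits:
        phase = "synching" if in_sync(start_ts, sync_intervals) else "not-synching"
        acc = sums[phase]
        acc[0] += 1
        acc[1] += writes
        acc[2] += wait_ns
        acc[3] += commit_ns
    return {
        phase: {
            "avg write cnt": acc[1] // acc[0],
            "avg zio_wait ns": acc[2] // acc[0],
            "avg commit ns": acc[3] // acc[0],
        }
        for phase, acc in sums.items()
    }

# Hypothetical samples: one spa_sync interval and three commits.
sample_sync = [(10, 20)]
sample_commits = [(5, 10, 100, 200), (15, 30, 1000, 1500), (25, 12, 300, 400)]
report = summarize(sample_commits, sample_sync)
for phase in sorted(report):
    for name, value in report[phase].items():
        print(f"  {name:<18}{phase:<14}{value:>10}")
```

A commit is attributed to the phase in which it starts. That is a simplification: a commit that straddles the start or end of a sync phase is counted entirely in one bucket.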

While not synching, we handle 11 writes per zil_commit_writer and spend 256 usec in zio_wait per zil_commit_writer.

  avg write cnt     not-synching        11
  avg zio_wait ns   not-synching    256106
  avg commit ns     not-synching    481365

While synching, we handle 33 writes per zil_commit_writer and spend 8342 usec in zio_wait per zil_commit_writer.

  avg write cnt     synching            33
  avg zio_wait ns   synching       8342496
  avg commit ns     synching       9044736

So here we're doing 33/11 = 3 times the work per zil_commit_writer for 8342496/256106 = ~33 times the zio_wait time.
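
The same arithmetic can be worked through from the reported averages. This is a small sketch that uses only the numbers printed above. Splitting commit time into "zio_wait" and "everything else" is an assumption made here for illustration, not something the script reports directly.

```python
# Averages copied from the script output above
# (nanoseconds, except the write counts).
not_sync = {"writes": 11, "zio_wait_ns": 256106, "commit_ns": 481365}
sync = {"writes": 33, "zio_wait_ns": 8342496, "commit_ns": 9044736}

def ratio(key):
    """How many times larger the synching value is than the not-synching one."""
    return sync[key] / not_sync[key]

work_ratio = ratio("writes")
wait_ratio = ratio("zio_wait_ns")
commit_ratio = ratio("commit_ns")

# zio_wait cost per aggregated write in each phase.
wait_per_write_not_sync = not_sync["zio_wait_ns"] / not_sync["writes"]
wait_per_write_sync = sync["zio_wait_ns"] / sync["writes"]
per_write_ratio = wait_per_write_sync / wait_per_write_not_sync

# Commit time spent outside zio_wait (assumed: commit minus zio_wait).
other_not_sync = not_sync["commit_ns"] - not_sync["zio_wait_ns"]
other_sync = sync["commit_ns"] - sync["zio_wait_ns"]
other_ratio = other_sync / other_not_sync

print(f"work per commit:        {work_ratio:.1f}x")
print(f"zio_wait per commit:    {wait_ratio:.1f}x")
print(f"zio_wait per write:     {per_write_ratio:.1f}x")
print(f"commit time per commit: {commit_ratio:.1f}x")
print(f"non-zio_wait time:      {other_ratio:.1f}x")
```

With these numbers, the part of the commit outside zio_wait grows by roughly 3.1x, in line with the 3x increase in work, while zio_wait per write grows by roughly 10.9x. This supports the reading that the extra latency is concentrated in zio_wait.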

What I think this says is that during the sync phase, the I/O scheduler induces extra latency even for the critical zios that go to the slog devices. This causes a cyclical drop in the number of zfs_write calls that are handled, and their latency is also impacted.

I reproduced this on an AR (fw_28 ~snv_87) using 6 clients (v2c02...) running the attached oltp2 benchmark.
The mountpoint is /v4.

	for h in v2c02 v2c04 v2c05 v2c06 v2c07 v2c08; do
	ssh root@$h TIME=10 RRRATIO=100 oltp2 arhost /v4 1000000 100 1
	done
Work Around
N/A
Comments
N/A