Description
I noticed that a straight 'ls' with no flags that cause it to stat each
file was taking much longer than I expected. I realized that the ZPL is
prefetching the dnodes/znodes associated with all the files, even though
we don't need them in this case. By disabling the call to
dmu_prefetch() in zfs_readdir(), 'ls' took 5x less time on a directory
with 1 million entries. This is on a 2x2GHz Opteron with a single IDE
disk, kmem_flags=0 zfs_flags=0. Same results with 2 mirrored IDE
disks. So I think we need to figure out how to not read in the znodes
when we don't need them.
Out of curiosity, I tested 'ls -l' with and without the prefetch as
well, to verify that the prefetch actually helps. I found that the
prefetching did improve things, not so much on a single disk, but with 2
mirrored or striped disks it was 2x faster.
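The two cases compared above differ only in whether each directory entry is stat'ed. A minimal sketch for reproducing that comparison on any file system follows. The helper names (make_files, list_names, list_with_stat, run_benchmark) are hypothetical, and the files it creates will most likely still be cached when they are listed, so it shows the shape of the test rather than the cold-cache numbers reported here.

```python
import os
import tempfile
import time


def make_files(directory, count):
    """Create `count` empty files in `directory`."""
    for i in range(count):
        with open(os.path.join(directory, f"f{i:07d}"), "w"):
            pass


def list_names(directory):
    """Roughly what a plain 'ls' needs: entry names only, no per-file stat."""
    return sorted(os.listdir(directory))


def list_with_stat(directory):
    """Roughly what 'ls -l' needs: entry names plus an lstat() of each entry."""
    return sorted(
        (name, os.lstat(os.path.join(directory, name)).st_size)
        for name in os.listdir(directory)
    )


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def run_benchmark(count=1000):
    """Time both listing styles on a scratch directory with `count` files."""
    with tempfile.TemporaryDirectory() as scratch:
        make_files(scratch, count)
        names, t_names = timed(list_names, scratch)
        stats, t_stats = timed(list_with_stat, scratch)
    return len(names), len(stats), t_names, t_stats


if __name__ == "__main__":
    n_names, n_stats, t_names, t_stats = run_benchmark()
    print(f"names only: {n_names} entries in {t_names:.4f}s")
    print(f"with lstat: {n_stats} entries in {t_stats:.4f}s")
```

To approximate the tests in this report, the scratch directory would have to live on the pool under test and the cache would have to be cold before each listing (for example by exporting and re-importing the pool).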
xxxxx@xxxxx.com 2005-07-15 11:03:46 GMT
I've just finished some testing with the current bits and
have seen different results from those seen
a few months back (albeit on different hardware).
I now see that prefetching slows down not only the /bin/ls case
(as seen before) but also the "/bin/ls -l" case (where Matt
previously saw a 2x perf improvement). Here are my observations:
gloomy.sfbay
zpool create whirl c1t2d0
write cache enabled
524288 files
prefetch on:
time /bin/ls > /dev/null
real 52.1s, user 7.7s, sys 44.1s
prefetch off:
time /bin/ls > /dev/null
real 38.1s, user 7.7s, sys 30.1s
prefetch on:
time /bin/ls -l > /dev/null
real 6m32s, user 27s, sys 6m2s
prefetch off:
time /bin/ls -l > /dev/null
real 6m24s, user 27s, sys 5m52s
------
mull.central
zpool create whirl c1t8d0 c1t9d0
write cache enabled
524288 files
prefetch on:
time /bin/ls > /dev/null
real 76s, user 9s, sys 45s
prefetch off:
time /bin/ls > /dev/null
real 60s, user 9s, sys 30s
prefetch on:
time /bin/ls -l > /dev/null
real 7m9s, user 30s, sys 6m10s
prefetch off:
time /bin/ls -l > /dev/null
real 6m40s, user 31s, sys 6m3s
Maybe Matt's ZAP improvements have changed the performance
in this area. Anyway it looks to me like we should just
rip out the prefetch code with the current bits.
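To put the timings above on a common footing, the relative cost of prefetching can be worked out from the "real" times. This is only a sketch: the helpers to_seconds and prefetch_overhead are hypothetical names, and the table simply restates the wall-clock numbers listed in this comment.

```python
def to_seconds(text):
    """Convert a time such as '6m32s', '52.1s' or '76s' to seconds."""
    text = text.rstrip("s")
    if "m" in text:
        minutes, seconds = text.split("m")
        return int(minutes) * 60 + float(seconds or 0)
    return float(text)


def prefetch_overhead(real_on, real_off):
    """Extra wall-clock time with prefetch on, as a percentage of prefetch off."""
    on = to_seconds(real_on)
    off = to_seconds(real_off)
    return (on - off) / off * 100.0


# (host, command, real time with prefetch on, real time with prefetch off)
REAL_TIMES = [
    ("gloomy.sfbay", "/bin/ls", "52.1s", "38.1s"),
    ("gloomy.sfbay", "/bin/ls -l", "6m32s", "6m24s"),
    ("mull.central", "/bin/ls", "76s", "60s"),
    ("mull.central", "/bin/ls -l", "7m9s", "6m40s"),
]

if __name__ == "__main__":
    for host, command, real_on, real_off in REAL_TIMES:
        overhead = prefetch_overhead(real_on, real_off)
        print(f"{host:14s} {command:12s} prefetch costs {overhead:5.1f}% more")
```

By this measure prefetching costs roughly 37% and 27% more wall-clock time for plain /bin/ls on the two machines, and roughly 2% and 7% more for /bin/ls -l.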