OpenSolaris

Bug ID 6509628
Synopsis unmount of a snapshot (from 'zfs destroy') is slow
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:zfs
Keywords
Responsible Engineer Eric Kustarz
Reported Against s10u2_fcs
Duplicate Of
Introduced In solaris_nevada
Commit to Fix snv_71
Fixed In snv_71
Release Fixed solaris_nevada(snv_71) , solaris_10u6(s10u6_02) (Bug ID:2156287)
Related Bugs 6537472
Submit Date 4-January-2007
Last Update Date 29-April-2008
Description
If you take a snapshot of a file system, then back it up using tar, cpio, pax or any other archiver so that all the files have been read, and then destroy that snapshot, the destroy operation takes an unreasonably long time to complete. During that time one CPU is pegged.

Here tank/test contains a root file system:

v4u-880k-gmp03 516 # zfs  snapshot tank/test@6
v4u-880k-gmp03 517 # time tar cf /dev/null /tank/test/.zfs/snapshot/6
tar: /tank/test/.zfs/snapshot/6/usr/jdk/instances/jdk1.5.0/jre/lib/sparc/cpu/sparcv9+vis/sparcv9/libclib_jiio.so: symbolic link too long
tar: /tank/test/.zfs/snapshot/6/usr/jdk/instances/jdk1.5.0/jre/lib/sparc/cpu/sparcv9+vis2/sparcv9/libclib_jiio.so: symbolic link too long

real    3m43.81s
user    0m12.74s
sys     1m24.24s
v4u-880k-gmp03 518 # time zfs destroy tank/test@6

real    1h6m44.70s
user    0m0.02s
sys     1h6m44.10s
v4u-880k-gmp03 519 #

Tracing this with dtrace shows all the time being spent in this loop:

	for (zp = list_head(&zfsvfs->z_all_znodes); zp; zp = nextzp) {
		nextzp = list_next(&zfsvfs->z_all_znodes, zp);
		if (zp->z_dbuf_held) {
			/* dbufs should only be held when force unmounting */
			zp->z_dbuf_held = 0;
			mutex_exit(&zfsvfs->z_znodes_lock);
			dmu_buf_rele(zp->z_dbuf, NULL);
			/* Start again */
			mutex_enter(&zfsvfs->z_znodes_lock);
			nextzp = list_head(&zfsvfs->z_all_znodes);
		}
	}

The list contains about 300,000 entries and each one has z_dbuf_held set. Hence this loop is iterated about 300,000*(300,000/2) times.
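
The quadratic cost of the loop above can be checked with a small user-space simulation. The following is a minimal sketch in C, not the kernel code: the array of flags stands in for z_all_znodes with z_dbuf_held set on every entry, and the function names count_restart_visits and restart_visits_closed_form are hypothetical. It assumes, as the description implies, that released entries stay on the list and are rescanned after every restart.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Simulate the restart-from-head scan over n entries that all start
 * out "held". Returns the number of entries visited in total, or 0 if
 * the flag array cannot be allocated.
 */
unsigned long long count_restart_visits(size_t n)
{
    unsigned char *held = malloc(n ? n : 1);
    unsigned long long visits = 0;
    size_t i = 0;

    if (held == NULL)
        return 0;
    memset(held, 1, n);          /* every entry has its flag set */

    while (i < n) {
        visits++;
        if (held[i]) {
            held[i] = 0;         /* "release" the entry ...        */
            i = 0;               /* ... and start again from head  */
        } else {
            i++;                 /* already released: skip over it */
        }
    }

    free(held);
    return visits;
}

/*
 * Closed form of the same count: releasing entry k (1-based) costs k
 * visits, and one final clean pass over all n entries ends the loop.
 */
unsigned long long restart_visits_closed_form(unsigned long long n)
{
    return n * (n + 1) / 2 + n;
}
```

For 300,000 entries the closed form gives 45,000,450,000 visits, in line with the 300,000*(300,000/2) estimate above. The simulation itself is only practical for small n; for example, count_restart_visits(1000) returns 501,500.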

You don't actually have to destroy the snapshot to reproduce this. Doing

umount -f /tank/test/.zfs/snapshot/6

has the same issue.

You don't see the problem on snapshots that have not been accessed, or on file systems. Indeed, even if a file system has a mounted snapshot that has been accessed, and which would therefore be slow to unmount on its own, unmounting the file system (tank/test in this case), which implies an unmount of /tank/test/.zfs/snapshot/6, is fast:

v4u-880k-gmp03 524 # zfs  snapshot tank/test@6                       
v4u-880k-gmp03 525 # time tar cf /dev/null /tank/test/.zfs/snapshot/6
tar: /tank/test/.zfs/snapshot/6/usr/jdk/instances/jdk1.5.0/jre/lib/sparc/cpu/sparcv9+vis/sparcv9/libclib_jiio.so: symbolic link too long
tar: /tank/test/.zfs/snapshot/6/usr/jdk/instances/jdk1.5.0/jre/lib/sparc/cpu/sparcv9+vis2/sparcv9/libclib_jiio.so: symbolic link too long

real    3m19.31s
user    0m12.59s
sys     0m56.69s
v4u-880k-gmp03 526 # pwd
/
v4u-880k-gmp03 527 # time umount /tank/test

real    0m2.90s
user    0m0.01s
sys     0m2.88s
v4u-880k-gmp03 528 #
Work Around
cd into the mountpoint of the file system for which this snapshot is being deleted and then attempt to unmount the file system. The unmount will fail as the file system is busy, but the subsequent unmount or destroy of the snapshot is fast:


v4u-880k-gmp03 519 # zfs  snapshot tank/test@6                       
v4u-880k-gmp03 520 # time tar cf /dev/null /tank/test/.zfs/snapshot/6
tar: /tank/test/.zfs/snapshot/6/usr/jdk/instances/jdk1.5.0/jre/lib/sparc/cpu/sparcv9+vis/sparcv9/libclib_jiio.so: symbolic link too long
tar: /tank/test/.zfs/snapshot/6/usr/jdk/instances/jdk1.5.0/jre/lib/sparc/cpu/sparcv9+vis2/sparcv9/libclib_jiio.so: symbolic link too long

real    3m18.98s
user    0m12.62s
sys     0m56.57s
v4u-880k-gmp03 521 # (cd $(zfs list -H -o mountpoint  tank/test) && umount $(/bin/pwd) )
cannot unmount '/tank/test': Device busy
v4u-880k-gmp03 522 # time zfs destroy tank/test@6                    

real    0m0.18s
user    0m0.01s
sys     0m0.02s
v4u-880k-gmp03 523 #
From the customer using the work-around:

"Yes this works, but you might want to document the work-around to
indicate that if the filesystem is shared, the umount seems to
make it unshared and that a "zfs share filesystem" command need to
be executed.  I found out the hard way."
From CR 6537472

Use this script to unmount 

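#!/bin/ksh -p
# $1 is the file system name. If the unmount fails, re-share the file
# system unless its sharenfs property is "off".
zfs unmount "$1" || [[ $(zfs get -Ho value sharenfs "$1") == "off" ]] || zfs share "$1"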
Workaround (same as above, but slightly refined)
----------
Before running 'zfs destroy <snapshot>', perform the following
steps:

1) 'cd' to the mountpoint of the filesystem.
2) 'umount' the filesystem. This will fail with "Device busy". Ignore
    the error message.

For example, assume you have a zfs file system 'foo' in zpool 'tank'
and a snapshot 'weekly'.

# zfs list -t snapshot
NAME              USED  AVAIL  REFER  MOUNTPOINT
tank/foo@weekly   22.0M -      79.0M  -
#

To destroy the above snapshot 'tank/foo@weekly', proceed as follows:

# pwd
/
# cd /tank/foo
# umount /tank/foo
cannot unmount '/tank/foo': Device busy
# cd -
# pwd
/
# zfs destroy tank/foo@weekly
#

Measure the time taken for 'zfs destroy' and compare it with the results
without this workaround. Please also let me know whether there is any
improvement.
Comments
N/A