bustos 2006-08-22
I tried to snapshot a pool with zfs snapshot -r. It failed, reporting that a
contained filesystem was busy. The filesystem in question was unmounted, a
clone, and had a legacy mountpoint.
DTrace indicated that zil_suspend() was failing with EBUSY, which, according to
the code, means that the filesystem had an unplayed log. zdb -ivvvv confirmed
this. I cleared the log by mounting and unmounting the filesystem, and the
snapshot command succeeded.
The unplayed log is probably from when I had the filesystem mounted in a zone
and took a crash dump.
Either this scenario should be eliminated (if we can automatically play logs),
or the error message should be improved (by indicating that the log is not
clear, and possibly how to clear it).
I've hit this too. On OpenSolaris, this is particularly a problem
because you wind up with lots of BEs in this state.
I had to manually mount and umount 20 different filesystems to
make this work. Marking NEW. This needs attention, because we're
planning to recommend using 'zfs snapshot -r' on OpenSolaris as
part of P2V (physical to virtual) conversion of systems into
containers. If we can't rely on zfs snapshot, this will be a
major customer dissatisfier.
This bug has racked up 13 SRs. I am marking NEW because this bug
is going to make implementing higher-level functionality (the
OpenSolaris zones P2V project) difficult. According to bug
6808530, this is also problematic for auto-snapshot
functionality.
I think it therefore qualifies as something which needs to be
evaluated by the release teams. I urge the ZFS team to prioritize
a fix for this problem.
Work Around
bustos 2006-08-22
Mount and unmount the problematic filesystem.
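For a filesystem with a legacy mountpoint, the mount/unmount cycle can be scripted roughly as below. This is a minimal sketch, not a tested procedure: it assumes the Solaris-style `mount -F zfs` form for legacy mounts, and the names `replay_log`, `run`, and `DRY_RUN`, as well as the dataset and directory in the usage note, are hypothetical. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: clear an unplayed intent log by mounting and unmounting a dataset.
# Assumption: legacy-mountpoint datasets are mounted with `mount -F zfs`.
# DRY_RUN, run and replay_log are hypothetical names used for illustration.

DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# replay_log DATASET [MOUNTDIR]
# Mounts DATASET on a scratch directory and unmounts it again.
# Stops at the first command that fails.
replay_log() {
    dataset=$1
    mnt=${2:-/tmp/zilreplay.$$}
    run mkdir -p "$mnt" &&
    run mount -F zfs "$dataset" "$mnt" &&
    run umount "$mnt" &&
    run rmdir "$mnt"
}
```

Usage, with a made-up dataset name: `replay_log tank/zones/clone1 /tmp/replay` prints the four commands; setting `DRY_RUN=0` executes them, after which `zfs snapshot -r` can be retried. For many filesystems in this state, the function can be called in a loop over the affected dataset names.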