OpenSolaris

Bug ID 6509807
Synopsis ZFS checksum ereports are not being posted
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:zfs
Keywords
Responsible Engineer Eric Schrock
Reported Against
Duplicate Of
Introduced In solaris_nevada
Commit to Fix snv_68
Fixed In snv_68
Release Fixed solaris_nevada(snv_68) , solaris_10u6(s10u6_01) (Bug ID:2156288)
Related Bugs
Submit Date 4-January-2007
Last Update Date 29-April-2008
Description
While debugging an unrelated problem, I noticed that we were seeing
checksum errors, but 'fmdump -e' wasn't showing any related
ereports.  After some dtracing, I found that zfs_ereport_post() is
correctly being called, but that we're erroneously ignoring the
errors.  In particular, zfs_ereport_post() has the following logic:

        /*
         * Ignore any errors from I/Os that we are going to retry anyway - we
         * only generate errors from the final failure.
         */
        if (zio && zio_should_retry(zio))
                return;

For checksum errors, we generate the ereport in zio_checksum_verify(),
which occurs _after_ the zio_io_assess() stage that normally issues the
retry.  Assuming that this is the intended behavior (to not retry
checksum errors), zfs_ereport_post() is making an invalid assumption
that the given I/O will be retried later.

Note that this only affects unreplicated pools.  In replicated pools, the
checksum errors appear at the leaf vdev, and the 'vd != vdev_top' check in
zio_should_retry() lets them through zfs_ereport_post().
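
The interaction described above can be sketched with a small model.  This is
not the actual ZFS source: the model_* types and functions are hypothetical
stand-ins, and the io_error and io_retries fields are assumptions added only
so that the "final failure" case can be expressed.  The only parts taken
from this report are the filter in zfs_ereport_post() and the
'vd != vdev_top' check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a vdev. */
typedef struct model_vdev {
	struct model_vdev *vdev_top;	/* top-level vdev; self at the top */
} model_vdev_t;

/* Hypothetical, heavily simplified stand-in for a zio. */
typedef struct model_zio {
	model_vdev_t *io_vd;	/* vdev the I/O was issued to */
	int io_error;		/* nonzero once an error is recorded (assumed) */
	int io_retries;		/* retries already issued (assumed) */
} model_zio_t;

/*
 * Simplified retry predicate: only a first-time error on a top-level
 * vdev counts as "will be retried".
 */
bool
model_should_retry(const model_zio_t *zio)
{
	assert(zio != NULL);
	if (zio->io_error == 0)
		return (false);
	if (zio->io_vd != NULL && zio->io_vd != zio->io_vd->vdev_top)
		return (false);		/* the 'vd != vdev_top' check */
	if (zio->io_retries > 0)
		return (false);
	return (true);
}

/*
 * Same shape as the filter quoted in the description: returns true if
 * an ereport would be posted, false if it would be silently dropped.
 */
bool
model_ereport_posted(const model_zio_t *zio)
{
	if (zio != NULL && model_should_retry(zio))
		return (false);		/* assumed to be retried later */
	return (true);
}
```

In this model, a checksum error on a single-disk pool (where the leaf vdev
is its own top-level vdev) is dropped on the first failure, because the
predicate still reports that a retry is pending even though the retry stage
has already passed.  The same error on the child of a mirror is posted,
since the child is not the top-level vdev.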
Work Around
N/A
Comments
N/A