OpenSolaris

Bug ID 6520519
Synopsis ZFS should diagnose faulty devices
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:zfs
Keywords
Responsible Engineer Eric Schrock
Reported Against
Duplicate Of
Introduced In
Commit to Fix snv_68
Fixed In snv_68
Release Fixed solaris_nevada(snv_68), solaris_10u6(s10u6_01) (Bug ID:2156299)
Related Bugs
Submit Date 1-February-2007
Last Update Date 3-July-2007
Description
Currently, ZFS does not diagnose drives that are pathologically
broken (i.e., continually returning I/O or checksum errors).
This results in a particularly bad experience when a device
goes out to lunch, because ZFS will continue to issue I/O to it
even though that I/O never completes.  This brings the
entire pool to a halt, and the user sometimes cannot even make
forward progress to determine what has gone wrong.

The ZFS diagnosis engine needs to listen to I/O and checksum
ereports and make an intelligent diagnosis.  Note that this 
will necessitate a new vdev state, VDEV_STATE_FAULTED, which
indicates an external request to fault the device.  This state
will be persistent, and there will have to be some careful
negotiation between fmd's resource cache and ZFS over repair
actions.
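
One way to picture the diagnosis described above is a simple
threshold rule: fault a device once it has produced N error
reports within T seconds, and keep it faulted until an explicit
repair.  The sketch below is illustrative only and is not the
delivered fix.  Apart from VDEV_STATE_FAULTED, which this report
names, every identifier, the threshold of 10 errors, and the
600-second window are assumptions made for the example.  It also
assumes timestamps never go backwards.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; the report does not specify them. */
#define SKETCH_MAX_EVENTS 10   /* N: errors needed to fault a device */
#define SKETCH_WINDOW_SEC 600  /* T: window in which they must occur */

/* Only VDEV_STATE_FAULTED comes from the report; the other name is made up. */
typedef enum {
    VDEV_STATE_HEALTHY_SKETCH,
    VDEV_STATE_FAULTED
} vdev_state_sketch_t;

/* Hypothetical per-device diagnosis record. */
typedef struct {
    uint64_t times[SKETCH_MAX_EVENTS]; /* ring buffer of error timestamps */
    size_t count;                      /* valid entries in the ring */
    size_t next;                       /* slot the next timestamp goes to */
    vdev_state_sketch_t state;
} vdev_diag_sketch_t;

void diag_init(vdev_diag_sketch_t *d)
{
    d->count = 0;
    d->next = 0;
    d->state = VDEV_STATE_HEALTHY_SKETCH;
}

/*
 * Record one I/O or checksum error report seen at now_sec.
 * Returns true if the device is faulted after this report.
 */
bool diag_record_error(vdev_diag_sketch_t *d, uint64_t now_sec)
{
    /* The faulted state is sticky: only diag_repair() clears it. */
    if (d->state == VDEV_STATE_FAULTED)
        return true;

    d->times[d->next] = now_sec;
    d->next = (d->next + 1) % SKETCH_MAX_EVENTS;
    if (d->count < SKETCH_MAX_EVENTS)
        d->count++;

    if (d->count == SKETCH_MAX_EVENTS) {
        /* Once the ring is full, 'next' indexes the oldest entry. */
        uint64_t oldest = d->times[d->next];
        if (now_sec - oldest <= SKETCH_WINDOW_SEC)
            d->state = VDEV_STATE_FAULTED;
    }
    return d->state == VDEV_STATE_FAULTED;
}

/* Explicit repair action: clear the fault and forget past errors. */
void diag_repair(vdev_diag_sketch_t *d)
{
    diag_init(d);
}
```

In this sketch a burst of ten errors within ten minutes faults
the device, while the same number of errors spread over hours
does not.  Persisting the state across reboots and coordinating
repair with fmd's resource cache are outside what it models.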
Work Around
N/A
Comments
N/A