Intel RAID Controllers - Best Practices white paper

Revision 1.0
8. Basic Troubleshooting
Some basic troubleshooting information is provided below for your reference.
Note: Before attempting to diagnose RAID failures or make any changes to the RAID
configuration, confirm that a complete and verified backup of critical data is available. A
backup is verified when the backed-up data has been compared against the original data.
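As a rough illustration of what "verified" means in the note above, the following Python sketch compares every file under an original directory against its backup copy by SHA-256 digest. The function names (file_digest, verify_backup) and the directory-to-directory layout are assumptions made for this example; they are not part of any Intel utility, and a real backup product may verify data differently.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(original_dir, backup_dir):
    """Return relative paths that are missing from or differ in the backup.

    An empty list means every original file has an identical backup copy.
    """
    original_root = Path(original_dir)
    backup_root = Path(backup_dir)
    problems = []
    for source in sorted(original_root.rglob("*")):
        if not source.is_file():
            continue  # directories are covered through the files they contain
        relative = source.relative_to(original_root)
        copy = backup_root / relative
        if not copy.is_file() or file_digest(source) != file_digest(copy):
            problems.append(relative.as_posix())
    return problems
```

Calling verify_backup with the original and backup directory paths returns the list of files that still need attention; only an empty result would count as a verified backup in the sense used here.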
Note: If you encounter a drive failure or an offline drive, do not remove any drives from the
system (hot plug) or shut the system down until you have verified the cause of the failure.
Contact Intel Customer Support if you have any questions.
8.1 Drive State Definition
The SAS Software Stack firmware defines the following states for physical disks connected to
the controller:
Unconfigured Good – A disk accessible to the RAID controller but not configured as a
part of a virtual disk. For example, a new drive inserted into a system.
Online – A disk accessible to the RAID controller and configured as part of a virtual disk.
Failed – A disk drive that is part of a virtual disk, but has failed and is no longer usable.
Rebuild – A disk drive to which data is being written to restore full redundancy to a virtual
disk.
Unconfigured Bad – A disk drive that is no longer part of an array and is known to be
bad. This state is typically assigned to a drive that has failed, but is no longer part of a
configured virtual disk because it has been replaced by a Hot-Spare drive.
Foreign – A disk is considered “Foreign” when it carries configuration information
(metadata) that is not in the NVRAM of the controller. When importing disks from a
different RAID controller (foreign metadata), the physical disk is marked as foreign until
user action is taken to add the configuration on the disks to the existing configuration in
the NVRAM on the controller. Foreign is not actually a drive state, but rather it indicates
that a drive is from another configuration. Foreign drives are typically in an unconfigured
good state until they are imported into the current configuration. For example, when a
system is powered on with drives from another system that contain RAID
configuration information, the drives are considered “foreign” until they have been
accepted or declined as part of the current configuration.
Hot spare – A disk drive that is defined as a hot spare. A hot spare automatically
comes online and replaces the first failed drive in a virtual disk. A hot spare
comes online only if it is the same size as or larger than the failed drive, and only
after a drive has been marked as failed.
Offline – A disk drive that is still part of a configured virtual disk, but which is not
active. This state is used to represent a configured drive for which the data is not valid.
This state can occur as a transition state or due to a user action.
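The definitions above can be sketched as a small Python model. This is an illustration only, not the firmware's data structures: the names DriveState, PhysicalDisk, and pick_hot_spare are hypothetical, "foreign" is modeled as a flag rather than a state (as the definition above describes), and choosing the smallest adequate spare is an assumption of this sketch, since the text states only the size and failure conditions.

```python
from dataclasses import dataclass
from enum import Enum


class DriveState(Enum):
    """Physical disk states as listed in section 8.1."""
    UNCONFIGURED_GOOD = "Unconfigured Good"
    ONLINE = "Online"
    FAILED = "Failed"
    REBUILD = "Rebuild"
    UNCONFIGURED_BAD = "Unconfigured Bad"
    HOT_SPARE = "Hot spare"
    OFFLINE = "Offline"


@dataclass
class PhysicalDisk:
    slot: int
    size_gb: int
    state: DriveState
    # Not a state: set when the disk carries metadata that is not in
    # the controller NVRAM (hypothetical field for this sketch).
    foreign: bool = False


def pick_hot_spare(disks, failed_disk):
    """Return a hot spare that may replace failed_disk, or None.

    A spare qualifies only if a drive has been marked as failed and the
    spare is the same size as or larger than that drive.
    """
    if failed_disk.state is not DriveState.FAILED:
        return None
    candidates = [
        disk for disk in disks
        if disk.state is DriveState.HOT_SPARE
        and disk.size_gb >= failed_disk.size_gb
    ]
    # Assumption of this sketch: prefer the smallest spare that fits.
    return min(candidates, key=lambda disk: disk.size_gb, default=None)
```

For example, with a failed 1000 GB drive and hot spares of 500 GB and 2000 GB, only the 2000 GB spare satisfies the size rule, so it is the one returned.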
8.2 Virtual Disk State Description
Optimal – A virtual disk with member drives that are online.