HP Smart Array 6400 Series Controllers Support Guide, September 2007
If the Online LED of the replacement disk drive stops blinking during automatic data recovery,
there are three possible causes:
• If the Online LED is glowing continuously, automatic data recovery is successful and finished.
• If the Fault LED is illuminated or other LEDs go out, the replacement disk has failed and is
producing unrecoverable disk errors. Remove and replace the failed replacement disk.
• If the automatic data recovery process has abnormally terminated, one possible cause is a
noncorrectable read error on another physical disk. Locate the faulty disk, replace it, and
restore data from backup.
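The three cases above can be read as a small decision table. The following is a minimal sketch only: the function name, the enum, and the boolean inputs are hypothetical, and the controller exposes no such interface in this guide. It also assumes that the Fault LED check takes priority, and that any state not covered by the first two bullets is treated as an abnormal termination.

```python
from enum import Enum


class RecoveryStatus(Enum):
    COMPLETE = "recovery finished successfully"
    REPLACEMENT_FAILED = "replacement disk failed; replace it again"
    ABNORMAL_TERMINATION = "recovery terminated abnormally; check other disks"


def diagnose_stopped_blinking(online_led_on: bool,
                              fault_led_on: bool,
                              other_leds_out: bool) -> RecoveryStatus:
    """Interpret the LEDs once the Online LED has stopped blinking.

    All parameters are illustrative observations made by an operator.
    """
    # Assumption: a lit Fault LED (or other LEDs going dark) is checked first.
    if fault_led_on or other_leds_out:
        return RecoveryStatus.REPLACEMENT_FAILED
    # Online LED glowing continuously: recovery is done.
    if online_led_on:
        return RecoveryStatus.COMPLETE
    # Assumption: anything else is treated as an abnormal termination.
    return RecoveryStatus.ABNORMAL_TERMINATION
```

For example, `diagnose_stopped_blinking(True, False, False)` returns `RecoveryStatus.COMPLETE`.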
Physical Disk Replacement Overview
CAUTION: A disk that was previously failed by the controller can seem to be operational after
the system is power-cycled, or (for a hot-pluggable drive) after the drive has been removed and
reinserted. However, continued use of such marginal drives can result in data loss. Replace all
marginal drives as soon as possible.
Consider these factors when replacing a disk:
• Non-hot-pluggable drives can be replaced only while the system is powered off.
• Hot-pluggable disks can be removed and replaced at any time.
• When a hot-pluggable disk is inserted, all disk activity on the array pauses while the new
drive is spinning up (usually 20 seconds). If the disk is inserted when power is on in a
fault-tolerant configuration, data recovery onto the replacement drive begins automatically,
indicated by the blinking Online LED.
• Replacement disks must have a capacity no less than that of the smallest disk in the array.
Disks with insufficient capacity are failed immediately by the controller, before automatic
data recovery begins.
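The capacity rule in the last item can be checked before a disk is inserted. The sketch below is illustrative only: the function name and the use of byte counts are assumptions, and it does not query the controller. It simply compares a candidate disk against the smallest disk already in the array.

```python
def replacement_is_acceptable(replacement_bytes: int,
                              array_disk_bytes: list[int]) -> bool:
    """Return True if the replacement is at least as large as the
    smallest disk currently in the array (hypothetical helper)."""
    if not array_disk_bytes:
        raise ValueError("the array must contain at least one disk")
    # The rule compares against the smallest member, not the failed disk.
    return replacement_bytes >= min(array_disk_bytes)
```

A disk that fails this check would be failed by the controller immediately, so running the comparison first avoids a wasted replacement attempt.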
IMPORTANT: In systems using external data storage, be sure that the server is the first unit to
be powered off and the last to be powered back on. This ensures that the system will not
erroneously mark the drives as failed.
The rebuild operation takes several hours, even if the system is not busy while the rebuild is in
progress. System performance and fault tolerance are both affected until the rebuild has finished.
Therefore, replace disks during low activity periods whenever possible. In addition, be sure that
all logical drives on the same array as the disk being replaced have a current, valid backup.
Physical Disk Failure During Rebuild
If another disk in the array fails during a rebuild, while fault tolerance is unavailable, a fatal
system error can occur. If this happens, all data on the array is lost. In exceptional cases,
however, failure of another disk does not lead to a fatal system error. These exceptions include:
• Failure after activation of a spare disk
• Failure of a disk that is not mirrored to any other failed disk (in a RAID 1+0 configuration)
• Failure of a second disk in a RAID ADG configuration
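Two of these exceptions can be expressed as simple checks. This is a rough sketch under stated assumptions: the function names, the disk numbering, and the mirror-pair layout are hypothetical, and the spare-disk case is not modeled. The RAID ADG check assumes the array tolerates at most two failed disks at once.

```python
def raid10_survives(failed_disks: set[int],
                    mirror_pairs: list[tuple[int, int]]) -> bool:
    """RAID 1+0: data survives as long as no mirror pair has lost
    both of its members (illustrative model)."""
    return not any(a in failed_disks and b in failed_disks
                   for a, b in mirror_pairs)


def adg_survives(failed_count: int) -> bool:
    """RAID ADG: assumed to tolerate up to two failed disks."""
    return failed_count <= 2
```

With mirror pairs `[(0, 1), (2, 3)]`, losing disks 0 and 3 leaves one copy of every mirror, whereas losing disks 0 and 1 removes both copies of the same data.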
Minimizing Fatal System Errors During Rebuild
When a physical disk is replaced, the controller gathers fault tolerance data from the remaining
disks in the array. This data is then used to rebuild the missing data from the failed disk onto
the replacement disk. If more than one disk is removed at a time, the fault tolerance data is
incomplete. The missing data cannot then be reconstructed and is likely to be permanently lost.










