Installation Guide - HP AD397A rx2660 SAS Smart Array P400 Controller

Fault Management
Fault Management Features
Appendix B
The Smart Array P400 Controller supports several fault management and data reliability features that
minimize the impact of disk drive defects on your systems.
Auto-Reliability Monitoring (ARM)
A firmware process that operates in the background, scanning physical disks for bad sectors
in fault-tolerant logical drives. ARM also verifies the consistency of parity data in logical
drives that use RAID 5 or RAID ADG. This process ensures that you can recover all data
successfully if a disk fails. ARM operates only when you select a fault-tolerant
configuration.
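As an illustration of the parity consistency check described above, the following is a minimal sketch, not the controller firmware. It assumes a RAID 5 style stripe in which the parity block is the byte-wise XOR of the data blocks. The function names xor_blocks and stripe_is_consistent are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) groups the bytes found at the same offset in every block.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity_block):
    """Assumed RAID 5 rule: stored parity equals the XOR of the data blocks."""
    return xor_blocks(data_blocks) == parity_block
```

A background scan of this kind would flag any stripe for which stripe_is_consistent returns False, so that the problem is found before a disk failure makes the parity necessary.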
Dynamic sector repair
Automatically remaps any sectors with media faults that are detected either during normal
operation or by Auto-Reliability Monitoring (ARM).
S.M.A.R.T.
An industry-standard diagnostic and failure prediction feature of physical disks, developed
by HP in collaboration with the disk drive industry. S.M.A.R.T. monitors several factors that
predict imminent physical disk failure due to mechanical causes. These include the
condition of the read/write head, the seek error rate, and the spin-up time. When a
threshold value is exceeded for one of these factors, the disk sends an alert to the controller
that failure is imminent. Thus, you can back up data and replace the disk drive before
failure occurs.
NOTE: An online spare does not become active and start rebuilding when an
imminent failure alert is sent, because the degraded disk has not failed yet
and is still online. The online spare is activated only after a disk in an array
has failed.
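The distinction between an imminent failure alert and an actual failure can be sketched as follows. This is an illustrative model only. The attribute names, the threshold values, and the rule that a reading above its threshold triggers the alert are assumptions taken from the wording above, not from the disk or controller firmware.

```python
from dataclasses import dataclass
from enum import Enum


class DiskState(Enum):
    OK = "ok"
    PREDICTIVE_FAILURE = "predictive failure"
    FAILED = "failed"


@dataclass
class Attribute:
    name: str         # for example "seek_error_rate" (hypothetical name)
    value: float      # current reading
    threshold: float  # hypothetical limit; exceeding it triggers the alert


def classify(attributes, has_failed=False):
    """Map monitored attributes to a disk state."""
    if has_failed:
        return DiskState.FAILED
    if any(attr.value > attr.threshold for attr in attributes):
        return DiskState.PREDICTIVE_FAILURE
    return DiskState.OK


def spare_should_activate(state):
    """The online spare starts rebuilding only after an actual failure."""
    return state is DiskState.FAILED
```

In this model a disk in the PREDICTIVE_FAILURE state is still online, so spare_should_activate returns False until the disk actually fails.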
Drive failure alert features
Send an alert message to Event Monitoring Services (EMS) when a physical disk or a logical
drive fails.
Interim data recovery
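Occurs if a disk fails in a fault-tolerant configuration. The logical drive remains
available, and data that was on the failed disk is served from the remaining redundant
data (mirror copy or parity).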
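As a rough sketch of how a read can still be satisfied after one disk fails, the example below assumes a single-parity (RAID 5 style) stripe, where the missing block equals the XOR of all surviving data blocks and the parity block. The function name reconstruct_missing is hypothetical, and the code does not describe the controller's actual implementation.

```python
def reconstruct_missing(surviving_blocks):
    """Recover the one missing block of a single-parity stripe.

    surviving_blocks holds every remaining data block plus the parity block.
    """
    if not surviving_blocks:
        raise ValueError("at least one surviving block is required")
    size = len(surviving_blocks[0])
    if any(len(block) != size for block in surviving_blocks):
        raise ValueError("all blocks must have the same length")
    missing = bytearray(size)
    for block in surviving_blocks:
        for offset, byte in enumerate(block):
            missing[offset] ^= byte  # XOR accumulates across all survivors
    return bytes(missing)
```

Because every such read needs all of the surviving disks, performance in this mode is typically lower than normal until the failed disk is replaced and rebuilt.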
Recovery ROM
A redundancy feature that ensures continuous system availability by providing a backup
ROM. This feature protects against corruption of a ROM image (for example, by power
fluctuation during ROM upgrade). If corruption occurs, the server automatically restarts
using the remaining good copy of the ROM image. When you upgrade the ROM, the inactive
image (the one not being used by the system) is upgraded. Normally, there is no
noticeable difference in operation. When you use Recovery ROM for the first time, however,
both ROM images are upgraded, causing a boot delay of about 60 seconds.
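The redundant ROM behavior described above can be modeled with a short sketch. Everything here is an assumption made for illustration: the class names RomImage and RedundantRom, the use of a CRC-32 checksum to detect corruption, and the choice not to switch the active image during an upgrade. The actual validation method used by the server is not stated in this guide.

```python
import zlib
from dataclasses import dataclass


@dataclass
class RomImage:
    data: bytes
    crc: int  # checksum stored when the image was written (assumed CRC-32)

    def is_valid(self):
        return zlib.crc32(self.data) == self.crc


def make_image(data):
    """Build an image whose stored checksum matches its contents."""
    return RomImage(data=data, crc=zlib.crc32(data))


class RedundantRom:
    def __init__(self, image_a, image_b, active=0):
        self.images = [image_a, image_b]
        self.active = active  # index of the image currently in use

    def upgrade(self, new_data):
        """Write the new firmware to the inactive image only."""
        inactive = 1 - self.active
        self.images[inactive] = make_image(new_data)
        return inactive

    def boot(self):
        """Boot from the active image, or fall back to the other good copy."""
        if self.images[self.active].is_valid():
            return self.active
        fallback = 1 - self.active
        if self.images[fallback].is_valid():
            self.active = fallback
            return fallback
        raise RuntimeError("no valid ROM image available")
```

Writing only to the inactive image means that an interrupted upgrade cannot damage the copy the system is currently running from.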