Users Guide

A PERC battery that is suspected to have failed, or that shows a warning symbol in OpenManage Server Administrator (OMSA), should have a manual Learn Cycle performed. A Learn Cycle discharges and recharges the battery, which restores it to a fully functional condition. In some cases, more than one Learn Cycle may be required to restore the battery to an effectively charged state. To perform a manual Learn Cycle, select Start Learn Cycle from the Battery Tasks drop-down menu in OMSA.
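
Where the OMSA command-line tools are installed, the same task can usually be started without the web interface. The sketch below only builds the command and prints it. It assumes the installed OMSA version accepts the `omconfig storage battery action=startlearn` form, and the controller and battery IDs of 0 are placeholders; confirm both against the CLI help on the server before running anything.

```python
from typing import List


def learn_cycle_command(controller_id: int, battery_id: int) -> List[str]:
    """Build the argument list that starts a battery Learn Cycle from the OMSA CLI.

    Assumption: the installed OMSA version accepts the
    `omconfig storage battery action=startlearn` form. Verify before use.
    """
    if controller_id < 0 or battery_id < 0:
        raise ValueError("controller and battery IDs must be non-negative")
    return [
        "omconfig", "storage", "battery",
        "action=startlearn",
        f"controller={controller_id}",
        f"battery={battery_id}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it. On a host with OMSA installed,
    # the same list could be passed to subprocess.run(...).
    print(" ".join(learn_cycle_command(controller_id=0, battery_id=0)))
```

Building the command as a list rather than a single string avoids shell quoting issues if it is later passed to `subprocess.run`.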

• Cache Use

Hardware RAID controllers use cache (a temporary repository of information) for normal operation. This cache consists of DRAM, which, like system memory, retains data only while powered on.

Newer controllers use NVCache, which preserves the cache contents when the server loses power. NVCache contains both DRAM (for normal operation) and non-volatile flash memory. During a power loss, the controller's battery (if operational) powers the DRAM so that its contents can be copied into the flash memory for indefinite storage.

The contents of cache can essentially be broken into three parts:

• RAID configuration and metadata - Information about the RAID arrays, including configuration information, disk members, and the role of each disk.

• Controller logs - RAID controllers maintain several log files. Dell technicians rely on the TTY log as the primary log for troubleshooting various RAID and hard drive issues.

• RAID data - This is the actual data destined to be written to the individual hard drives. Data is written into the cache of the controller in both Write Through and Write Back cache policy modes.
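
The point that data passes through the controller cache under both policies can be illustrated with a toy model. This is a minimal sketch, not PERC firmware behavior: `ToyController` and its fields are hypothetical names, and the acknowledgement timing (after the disk write for Write Through, from cache for Write Back) reflects the general meaning of the two policies rather than anything stated above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyController:
    """Toy model of a controller cache in front of a drive (illustrative only)."""
    write_back: bool
    cache: Dict[int, bytes] = field(default_factory=dict)  # LBA -> data held in cache
    disk: Dict[int, bytes] = field(default_factory=dict)   # LBA -> data on the drive
    dirty: List[int] = field(default_factory=list)         # cached LBAs not yet on disk

    def write(self, lba: int, data: bytes) -> str:
        # Under both policies the data lands in cache first.
        self.cache[lba] = data
        if self.write_back:
            # Write Back: acknowledge now, commit to the drive later.
            if lba not in self.dirty:
                self.dirty.append(lba)
            return "acknowledged from cache"
        # Write Through: commit to the drive before acknowledging.
        self.disk[lba] = data
        return "acknowledged after disk write"

    def flush(self) -> None:
        """Commit every dirty cache entry to the drive."""
        for lba in self.dirty:
            self.disk[lba] = self.cache[lba]
        self.dirty.clear()


if __name__ == "__main__":
    wb = ToyController(write_back=True)
    print(wb.write(7, b"payload"), "| on disk:", 7 in wb.disk)
    wb.flush()
    print("after flush | on disk:", 7 in wb.disk)
```

In the Write Back case the `dirty` list is exactly the data that exists only in cache, which is why battery-backed or non-volatile cache matters for that policy.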

Slicing

Configuring multiple RAID arrays across the same set of disks is called Slicing.
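
As an illustration of the definition, the sketch below groups virtual disks by their member drives and reports any drive set that carries more than one array. The layout, the virtual disk names, and the `enclosure:slot` drive IDs are made up for the example, and the helper `find_sliced_disk_sets` is hypothetical.

```python
from typing import Dict, FrozenSet, List


def find_sliced_disk_sets(
    virtual_disks: Dict[str, List[str]]
) -> Dict[FrozenSet[str], List[str]]:
    """Return each set of physical disks that hosts more than one virtual disk."""
    by_members: Dict[FrozenSet[str], List[str]] = {}
    for name, members in virtual_disks.items():
        # Virtual disks with an identical member set share the same drives.
        by_members.setdefault(frozenset(members), []).append(name)
    return {m: sorted(names) for m, names in by_members.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical layout: two arrays sliced across drives 0:0-0:2, one array alone.
    layout = {
        "VD0": ["0:0", "0:1", "0:2"],
        "VD1": ["0:0", "0:1", "0:2"],
        "VD2": ["0:3", "0:4"],
    }
    for members, names in find_sliced_disk_sets(layout).items():
        print(sorted(members), "->", names)
```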

RAID puncture

A RAID puncture is a feature of the Dell PowerEdge RAID Controller (PERC) that allows the controller to restore the redundancy of the array despite the loss of data caused by a double fault condition. Another name for a RAID puncture is rebuild with errors. When the RAID controller detects a double fault and there is insufficient redundancy to recover the data in the impacted stripe, the controller creates a puncture in that stripe and allows the rebuild to continue.

• Any condition that causes data to be inaccessible in the same stripe on more than one drive is a double fault.

• Double faults cause the loss of all data within the impacted stripe.

• All RAID punctures are double faults, but not all double faults are RAID punctures.
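
Why a double fault loses the whole stripe can be shown with simple XOR parity. The sketch below assumes a single-parity layout (as in RAID 5), which the text above does not specify; `recover_stripe` and `xor_blocks` are hypothetical helpers, not controller code. One unreadable block can be recomputed from the survivors; two cannot.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_stripe(blocks: List[Optional[bytes]]) -> Optional[List[Optional[bytes]]]:
    """Recover one stripe given one block per drive (data and parity).

    A block is None when it cannot be read. Returns the complete stripe,
    or None when two or more blocks are missing (a double fault).
    """
    missing = [i for i, block in enumerate(blocks) if block is None]
    if not missing:
        return list(blocks)
    if len(missing) > 1:
        return None  # Not enough redundancy left to recompute anything.
    survivors = [block for block in blocks if block is not None]
    restored = list(blocks)
    # With single parity, the missing block is the XOR of all the others.
    restored[missing[0]] = xor_blocks(survivors)
    return restored


if __name__ == "__main__":
    d0, d1 = b"\x01\x02", b"\x10\x20"
    parity = xor_blocks([d0, d1])
    print(recover_stripe([d0, None, parity]))    # single fault: d1 is recomputed
    print(recover_stripe([None, None, parity]))  # double fault: nothing to recover
```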

Causes of RAID puncture

Without the RAID puncture feature, the array rebuild would fail and leave the array in a degraded state. In some cases, the failures may cause additional drives to fail and leave the array in a non-functioning, offline state. Puncturing an array has no impact on the ability to boot to or access any data on the array.

RAID punctures can occur in one of two situations:

• A double fault already exists (data is already lost): a data error on an online drive is propagated (copied) to a rebuilding drive.

• A double fault does not yet exist (data is lost when the second error occurs): while the array is in a degraded state, a bad block occurs on an online drive, and that LBA is punctured.
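
The degraded-state case can be sketched as a rebuild loop that records a puncture instead of aborting. This is a simplified model under stated assumptions: stripes are plain integers, the set of stripes with a bad block on an online drive is known in advance, and `rebuild_with_puncture` is a hypothetical name rather than a controller function.

```python
from typing import List, Set, Tuple


def rebuild_with_puncture(
    stripe_count: int, bad_stripes_on_online_drives: Set[int]
) -> Tuple[List[int], List[int]]:
    """Walk every stripe of a degraded array and return (rebuilt, punctured)."""
    rebuilt: List[int] = []
    punctured: List[int] = []
    for stripe in range(stripe_count):
        if stripe in bad_stripes_on_online_drives:
            # A second error in a stripe that is already missing one block:
            # the data cannot be recomputed, so record a puncture and continue.
            punctured.append(stripe)
        else:
            rebuilt.append(stripe)
    return rebuilt, punctured


if __name__ == "__main__":
    rebuilt, punctured = rebuild_with_puncture(5, {2})
    print("rebuilt:", rebuilt, "punctured:", punctured)
```

The loop never stops early, which mirrors the purpose of the feature: the rebuild completes and redundancy is restored for every stripe except the punctured ones.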

Troubleshooting hardware issues