Technologies for the ProLiant ML570 G3 and ProLiant DL580 G3 Servers Technology Brief
Demand scrubbing allows the chipset to write corrected data back to memory when a read detects a
correctable memory error. Without scrubbing, every later read of that same memory location would
detect another correctable error, which may result in the system marking the DIMM as degraded. For
soft errors, demand scrubbing prevents the repeated correctable errors that would otherwise follow
the first one. Demand scrubbing also reduces the likelihood that a second soft error will combine
with the first to produce a multi-bit error. Multi-bit errors cause a system failure unless Hot
Plug Mirrored Memory or Hot Plug RAID is enabled on the system.
The ProLiant ML570 G3 and ProLiant DL580 G3 servers support demand scrubbing in system ROMs
dated after Feb 28, 2005.
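The effect of demand scrubbing described above can be illustrated with a small simulation. This is
only a sketch under simplifying assumptions: the class ScrubbingMemory and its methods are
hypothetical names, and the model treats one flipped bit as correctable and two or more as
uncorrectable, which is a simplification and not the actual ECC scheme used by the chipset.

```python
from dataclasses import dataclass, field


@dataclass
class ScrubbingMemory:
    """Toy memory model (hypothetical): tracks flipped bits per address."""

    demand_scrub: bool
    good: dict = field(default_factory=dict)     # address -> correct data
    flipped: dict = field(default_factory=dict)  # address -> set of flipped bit positions
    correctable_errors: int = 0                  # errors reported to the system

    def write(self, addr, value):
        self.good[addr] = value
        self.flipped[addr] = set()

    def inject_soft_error(self, addr, bit):
        # Flipping the same bit twice restores it, hence symmetric difference.
        self.flipped[addr] ^= {bit}

    def read(self, addr):
        bad_bits = self.flipped[addr]
        if len(bad_bits) > 1:
            # Assumption: more than one bad bit cannot be corrected.
            raise RuntimeError("uncorrectable multi-bit error")
        if len(bad_bits) == 1:
            self.correctable_errors += 1
            if self.demand_scrub:
                # Write the corrected data back so later reads are clean.
                self.flipped[addr] = set()
        return self.good[addr]
```

With demand_scrub=True, two reads after a single injected soft error report one correctable error;
with demand_scrub=False, each read reports the error again, and a second soft error at the same
address makes the location unreadable in this model.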
High-availability memory technologies
The ProLiant ML570 G3 and DL580 G3 servers offer three levels of Advanced Memory Protection that
provide increased fault tolerance for applications requiring higher levels of availability: Online
Spare, Hot Plug Mirrored Memory, and Hot Plug RAID.
Hot-plug definitions
As already mentioned, advanced ECC supports hot-add of memory boards so that the amount of
memory available to the OS is increased while the server is running.
Hot-replace, on the other hand, allows a memory board to be removed, the failed or degraded
DIMMs to be replaced, and the memory board to be re-installed, all while the server is running.
Hot-replace requires no OS support and can be used with either mirroring or RAID techniques.
Online Spare
In Online Spare mode, when a server DIMM exceeds a threshold rate of correctable memory errors,
the rank of memory within that DIMM that exceeded the threshold is taken offline and the XMB
memory controller copies its data to a replacement rank (the Online Spare). Because a DIMM with a
high rate of correctable memory errors is at increased risk of having an uncorrectable memory
error, Online Spare allows these higher-risk DIMMs to be removed from the memory map. Using
Online Spare reduces the chance of an uncorrectable error bringing down the system; however, it
does not fully protect the system against uncorrectable memory errors.
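The failover sequence described above can be sketched as follows. This is a rough illustration,
not the XMB controller's actual logic: the class OnlineSpareBoard, its method names, and the use
of a simple error count in place of a threshold rate are all assumptions made for the example.

```python
class OnlineSpareBoard:
    """Toy model (hypothetical) of one memory board in Online Spare mode."""

    def __init__(self, ranks, spare, threshold):
        self.data = {rank: f"data-{rank}" for rank in ranks}  # contents of active ranks
        self.errors = {rank: 0 for rank in ranks}             # correctable errors per rank
        self.spare = spare          # reserved rank; None once it has been used
        self.threshold = threshold  # assumption: a count stands in for an error rate
        self.offline = []           # ranks removed from the memory map

    def correctable_error(self, rank):
        """Record one correctable error; return True if a failover happened."""
        if rank not in self.errors:
            return False  # rank is already offline
        self.errors[rank] += 1
        if self.errors[rank] <= self.threshold or self.spare is None:
            return False  # below threshold, or no spare left on this board
        # Copy the data to the spare rank, then retire the failing rank.
        self.data[self.spare] = self.data.pop(rank)
        self.errors[self.spare] = 0
        del self.errors[rank]
        self.offline.append(rank)
        self.spare = None
        return True
```

Once the single spare rank has been consumed, further correctable errors on the board are still
counted in this model but can no longer trigger a failover, which mirrors the point that Online
Spare reduces risk without fully protecting the system.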
When a system uses Online Spare memory, the Online Spare rank must be at least as large as every
other memory rank on the memory board. In Online Spare mode, one rank of memory per memory
board is reserved as the spare rank and is not available to the OS. For a memory board containing
DIMMs of varying sizes, the system chooses the largest rank on the memory board as the Online
Spare.
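The capacity cost of this rule is simple arithmetic: the largest rank on each board is set aside,
and the OS sees the rest. A minimal sketch, assuming rank sizes are given in MB and using the
hypothetical helper name plan_online_spare:

```python
def plan_online_spare(rank_sizes_mb):
    """Pick the spare rank for one memory board and compute OS-visible capacity.

    rank_sizes_mb is a list of rank sizes in MB (illustrative input format).
    Returns (index of the spare rank, memory in MB left for the OS).
    """
    if len(rank_sizes_mb) < 2:
        raise ValueError("need at least one active rank plus one spare rank")
    # The largest rank becomes the spare, so it can hold the data of any other rank.
    spare_index = max(range(len(rank_sizes_mb)), key=lambda i: rank_sizes_mb[i])
    usable_mb = sum(rank_sizes_mb) - rank_sizes_mb[spare_index]
    return spare_index, usable_mb
```

For example, a board with ranks of 512, 1024, 512, and 1024 MB would reserve one of the 1024 MB
ranks and leave 2048 MB available to the OS. The sizes here are made up for illustration.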
Online Spare works independently for each memory board. In other words, each board can copy
data to its Online Spare rank independently of what is happening with any other memory board.
Online Spare is supported with any number of memory boards installed.
Online Spare memory does not support any hot-plug operations. The server must still be powered
down to replace a bad memory module, but it can continue to operate until a scheduled shutdown.
Hot Plug Mirrored Memory
Mirrored memory mode is a fault-tolerant memory option that provides a higher level of availability
than Online Spare memory. Mirrored memory allows the server to keep two copies of all memory
data on separate memory boards. This protects the system against uncorrectable memory errors. If
an uncorrectable error is encountered, the server automatically retrieves the correct data from
the memory board that does not contain errors.
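The read path described above can be sketched in a few lines. This is an illustrative model only:
the classes MemoryBoard and MirroredMemory, the mark_bad helper, and the exception type are
hypothetical and do not represent the server's actual hardware interface.

```python
class UncorrectableError(Exception):
    """Raised by the toy model when a location cannot be read correctly."""


class MemoryBoard:
    """Toy memory board (hypothetical): stores values and tracks bad addresses."""

    def __init__(self):
        self.cells = {}
        self.bad = set()

    def write(self, addr, value):
        self.cells[addr] = value
        self.bad.discard(addr)  # fresh data replaces a previously bad location

    def mark_bad(self, addr):
        self.bad.add(addr)  # simulate an uncorrectable error at this address

    def read(self, addr):
        if addr in self.bad:
            raise UncorrectableError(addr)
        return self.cells[addr]


class MirroredMemory:
    """Keeps two copies of all data on separate boards."""

    def __init__(self, primary, mirror):
        self.primary = primary
        self.mirror = mirror

    def write(self, addr, value):
        # Every write goes to both boards so the copies stay identical.
        self.primary.write(addr, value)
        self.mirror.write(addr, value)

    def read(self, addr):
        try:
            return self.primary.read(addr)
        except UncorrectableError:
            # Fall back to the copy on the board without the error.
            return self.mirror.read(addr)
```

In this model a read fails only if the same address is bad on both boards, which is why mirroring
gives stronger protection than Online Spare.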