Disk Load Balancing, Fault Tolerance, and Configuration Limits for NonStop Systems

I/O buses
The most common issue at this level in the architecture is bandwidth. For all bus types (SCSI buses, Fibre Channel loops, and SAS domains), multiple devices share the bandwidth of a single backplane, cable, or expander. For this reason, it is best to spread the devices evenly across the available I/O buses.
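To make the balancing rule concrete, here is a minimal sketch (Python; the drive and bus names are hypothetical, not NonStop configuration syntax) that assigns drives to I/O buses round-robin so that no bus carries more than its share of devices:

```python
# Minimal sketch: spread drives evenly across the available I/O buses
# by round-robin assignment. Drive and bus names are illustrative only.
from collections import defaultdict

def balance_across_buses(drives, buses):
    """Assign each drive to a bus round-robin; bus loads differ by at most one."""
    assignment = defaultdict(list)
    for i, drive in enumerate(drives):
        assignment[buses[i % len(buses)]].append(drive)
    return assignment

drives = [f"$DATA{n:02d}" for n in range(1, 9)]   # 8 drives
buses = ["SCSI-A", "SCSI-B"]                      # 2 shared buses
for bus, members in balance_across_buses(drives, buses).items():
    print(bus, members)                           # each bus gets 4 of the 8 drives
```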
An I/O bus is also a potential point of failure, so the primary and mirror drives of a mirrored disk volume should not be configured on the same I/O bus. How buses are shared depends on the enclosure type:
- All of the disks in a 45xx disk module share one I/O bus.
- All of the disks in an FCDM disk module share two Fibre Channel loops.
- All of the disks in an MSA70 enclosure share two SAS domains.
- In an S-series processor or I/O enclosure, the disks in the odd-numbered slots share one SCSI bus, and the disks in the even-numbered slots share the other.
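As one illustration of the rule, the S-series slot-to-bus mapping above can be checked programmatically. The sketch below (Python; the slot numbers are hypothetical) flags any mirrored pair whose primary and mirror sit in slots of the same parity, and therefore on the same SCSI bus:

```python
# S-series enclosures: odd-numbered slots share one SCSI bus,
# even-numbered slots share the other (per the text above).
def scsi_bus_of(slot: int) -> str:
    return "bus-odd" if slot % 2 else "bus-even"

def mirror_pair_ok(primary_slot: int, mirror_slot: int) -> bool:
    """Primary and mirror must land on different SCSI buses."""
    return scsi_bus_of(primary_slot) != scsi_bus_of(mirror_slot)

assert mirror_pair_ok(1, 2)        # odd + even: different buses, OK
assert not mirror_pair_ok(3, 5)    # both odd: same bus, a single point of failure
```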
A Fibre Channel loop through a daisy chain of FCDM disk modules, or a SAS domain through a daisy chain of MSA70 disk enclosures, should also be considered a potential point of failure, so the primary and mirror drives of a mirrored disk volume should not be configured on the same daisy chain. A failure or service procedure in one member of a daisy chain (either FCDMs or MSA70s) can isolate all of the enclosures further down the chain. For this reason, the alternate path (that is, the other Fibre Channel loop or SAS domain) should be connected to the opposite end of the daisy chain, so that failure or removal of one disk enclosure does not eliminate both paths to any other enclosure.
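The reasoning about chain position can be sketched as a toy model (Python; this is not NonStop tooling): if one loop enters the chain at the first enclosure and the other at the last, every enclosure is reachable from the front through the enclosures before it and from the back through the enclosures after it, so no single enclosure failure can sever both paths to any other enclosure:

```python
def reachable(target: int, failed: int, n: int) -> bool:
    """Toy model of a daisy chain of n enclosures (0..n-1).
    One loop enters at enclosure 0; the alternate loop enters at n-1.
    A failed enclosure blocks everything beyond it on each loop."""
    via_front = all(e != failed for e in range(0, target + 1))  # first loop
    via_back  = all(e != failed for e in range(target, n))      # alternate loop
    return via_front or via_back

n = 4
for failed in range(n):
    for target in range(n):
        if target != failed:
            assert reachable(target, failed, n)  # never loses both paths
```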
A Fibre Channel SAN fabric should be considered a potential point of failure because, during fabric reconfiguration, the entire fabric can pause long enough for disk I/O to time out. For best fault tolerance, all disk -P and -MB paths should use a SAN fabric that is separate from the SAN fabric used by all disk -B and -M paths.
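A quick way to express this rule is as a check over the path-to-fabric assignment. The sketch below (Python; the fabric labels are hypothetical) verifies that the -P/-MB paths never share a fabric with the -B/-M paths:

```python
# Hypothetical path-to-fabric map; the fabric labels are illustrative only.
paths = {"-P": "FABRIC1", "-MB": "FABRIC1", "-B": "FABRIC2", "-M": "FABRIC2"}

def fabrics_separated(paths: dict) -> bool:
    """-P and -MB must use fabrics disjoint from those used by -B and -M."""
    group1 = {paths["-P"], paths["-MB"]}
    group2 = {paths["-B"], paths["-M"]}
    return group1.isdisjoint(group2)

assert fabrics_separated(paths)
```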
Examples
The first example shows 16 mirrored FCSA-attached disk volumes load balanced in every way (the path arithmetic is tallied in the sketch after the list):
- Across 4 processors
  - 4 IOPs primary in each processor
  - 4 IOPs backup in each processor
- Across 2 ServerNet fabrics (based on FCSA affinity)
  - 32 configured paths through each fabric
  - 16 active paths through each fabric
- Across 8 SACs (4 FCSAs × 2 SACs on each FCSA)
  - 8 configured paths through each SAC
  - 4 active paths through each SAC
- Across each combination of processor and SAC
  - 4 configured paths from each processor to each SAC
  - 2 active paths from each processor to each SAC
  - 1 active path from each primary processor to each SAC
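The counts in this example follow from simple arithmetic. The sketch below (Python) reproduces them under one reading of the configuration: each mirrored volume has four configured paths (-P, -B, -M, -MB), two of which are active at a time, and each path is counted from both the IOP's primary and backup processor. This counting convention is an assumption for illustration, not stated in the source:

```python
volumes    = 16        # mirrored FCSA-attached disk volumes
processors = 4
fabrics    = 2         # ServerNet fabrics
sacs       = 8         # 4 FCSAs x 2 SACs each

configured_per_vol = 4 # -P, -B, -M, -MB
active_per_vol     = 2 # one active path per drive of the mirrored pair

configured = volumes * configured_per_vol   # 64 configured paths
active     = volumes * active_per_vol       # 32 active paths

assert configured // fabrics == 32   # configured paths through each fabric
assert active // fabrics     == 16   # active paths through each fabric
assert configured // sacs    == 8    # configured paths through each SAC
assert active // sacs        == 4    # active paths through each SAC

combos = processors * sacs                  # 32 processor/SAC combinations
assert configured * 2 // combos == 4 # counting both IOP processors per path
assert active * 2 // combos     == 2 # active paths from each processor to each SAC
assert active // combos         == 1 # active path per primary processor per SAC
```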