Disk Load Balancing, Fault Tolerance, and Configuration Limits for NonStop Systems

NB-series systems, characterized by processors and ServerNet switches in a c7000 enclosure and
running J-series RVUs.
NB-series I/O consists of CLIMs connected to ESS (HP XP) and MSA70 disk enclosures containing
SAS disks. The I/O buses are Fibre Channel and SAS.
NB-series systems support a backward-compatible I/O generation by replacing some CLIMs with
IOAME enclosures.
NB-series systems support another backward-compatible I/O generation by replacing some
CLIMs with S-series I/O enclosures.
Processors and memory
For fault tolerance, an IOP should run in two processors that do not share any packaging. In S-series
systems running G06.22 or earlier RVUs, the two halves of an IOP had to run in processors in the
same S-series processor enclosure; starting with the G06.23 RVU, this constraint was eliminated.
Fault tolerance is best when the loss of a single processor enclosure cannot take down both halves
of an IOP, but placing the halves far apart can increase the ServerNet path length and affect
performance. A reasonable compromise is to split each IOP between two processor enclosures that
are directly connected to each other. S-series processor enclosure connectivity is described in the
following ServerNet section.
In an NS16x00 system, the processors are packaged in processor complexes. Logical processors 0,
1, 2, and 3 share a processor complex. Processors 4-7 share a complex, processors 8-11 share a
complex, and processors 12-15 share a complex. In other NS-series systems, processors are rack
mounted pairwise in processor modules. Processors 0 and 2 share a module, as do processors 1/3,
4/6, 5/7, 8/10, 9/11, 12/14, and 13/15.
In all system types, the $SYSTEM IOP must run in processors 0 and 1 irrespective of any common
packaging.
In S-series systems, disks, adapters, and processors are packaged together, so the SCF ADD DISK
command provides load balanced default values when processors are not specified. In NS-series and
NB-series systems, processor packaging is decoupled from I/O packaging so it is necessary to specify
processors in the SCF ADD DISK command to spread the IOPs across the available processors.
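To make this concrete, here is a hedged sketch of what such commands might look like. The volume
names and processor numbers are placeholders, and the location and path attributes that a real
ADD DISK command also requires (which depend on the I/O type) are omitted:

```
== Sketch only: real commands also need location/path attributes,
== which depend on the I/O type (CLIM, FCSA, and so on).
ADD DISK $DATA01, PRIMARYCPU 2, BACKUPCPU 3
ADD DISK $DATA02, PRIMARYCPU 3, BACKUPCPU 2
```

Note that the two volumes reverse their PRIMARYCPU and BACKUPCPU assignments, so each processor
serves as primary for one volume and backup for the other.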
For load balancing, half of the IOPs configured to use a processor should use it as a PRIMARYCPU
and the other half should use it as a BACKUPCPU. If the configuration of PRIMARYCPUs and
BACKUPCPUs in the CONFIG file is unbalanced and it is not convenient to stop the IOPs and change
the configuration, it is possible to swap the current primary and backup IOP halves online with
SCF PRIMARY. SCF PRIMARY only affects the running IOP. It does not change the configured
PRIMARYCPU or BACKUPCPU.
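As a sketch, assuming a hypothetical volume $DATA01, the online swap and its verification might
look like:

```
PRIMARY DISK $DATA01
STATUS DISK $DATA01
INFO DISK $DATA01, DETAIL
```

After the PRIMARY command, STATUS DISK should show that the processors currently acting as
primary and backup have exchanged roles, while INFO DISK with the DETAIL option should show the
configured PRIMARYCPU and BACKUPCPU unchanged.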
The size of physical memory imposes a practical limit on how many IOPs can run in a processor.
Each DP2 IOP half, whether primary or backup, requires a minimum amount of memory to support
any function. There is also a practical minimum: the amount of memory needed to efficiently
support the set of applications using the volume. The latter is a performance issue and is not
covered here, except to note that the optimal memory configuration is typically obtained by
varying DP2 attributes until the desired performance is reached.
For the purpose of estimating memory needs, one can first assume that each DP2 process pair
requires the same amount of memory in each processor. This is not always the case, but is a good
approximation. This assumption also covers the memory needed should the primary processor fail
and all disks then be primaried to the remaining processor.
Note that the best processor loading is typically achieved by configuring half for primary and half for
backup. In some environments a different mix might yield the best performance. Many customers alter
the primary/backup mix during the day to address different loads.