Rolling Reload of NonStop Kernel Processors

Rolling Reload of NonStop Kernel Processors 2/18/04
Hewlett-Packard Development Company L.P. 6 of 7 CSSI Website
Perform similar checks for applications and other subsystems such as
ServerNet/FX and Telco-specific SACs.
If necessary, verify the health of the applications.
Verify that the time since the halt is about what was determined in the planning
phase (Section 2.5, Determine how much time is required for a rolling reload).
4.3 Execute update step(s), if needed, before reload.
You might perform a rolling reload to install new software or firmware on the processors.
That update must now be applied to the halted processor. In many cases, the reload
operation in Section 4.4 automatically installs the new software, so this step is often not
needed.
4.4 Reload the halted processor.
Use “RELOAD <n>, PRIME”, or use the processor OSM Reload action. The “PRIME”
option forces a new copy of boot millicode to be loaded. This load is normally
not needed, but it takes less than one minute and ensures a consistent processor
reset and copy of millicode.
Note: Do not specify the ServerNet fabric for the reload. (The X fabric will be tried
first if no fabric is specified.)
Note: Sometimes, switching from the primary process to the backup process in a
NonStop process pair is costly in terms of system resources or system response
time, even though the backup process is ready. For a system with heavy load
and tight response-time requirements, the following steps might be needed to
avoid causing an application outage while all the disks re-primary and the disk
process preloads very large disk caches.
(1) Use “RELOAD <n>, PRIME, NOSWITCH” (the NOSWITCH option) to reload the
halted processor.
(2) Switch the disk processes one at a time, limiting the impact of the disk
cache reload because only one cache at a time is reloaded.
(3) You might also need to pre-switch the disk processes one at a time for this
type of system prior to halting a processor.
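The one-at-a-time switching in steps (2) and (3) can be sketched as a small sequencing loop. This is an illustrative sketch only, not a vendor-supplied tool: the `switch` and `is_settled` callables, the volume names, and the polling and timeout values are all hypothetical placeholders that an operator would replace with site-specific actions and checks.

```python
import time
from typing import Callable, Iterable, List


def switch_one_at_a_time(
    volumes: Iterable[str],
    switch: Callable[[str], None],        # hypothetical hook: request the switch for one volume
    is_settled: Callable[[str], bool],    # hypothetical hook: True once that volume has recovered
    poll_seconds: float = 5.0,            # assumed polling interval
    timeout_seconds: float = 600.0,       # assumed per-volume limit
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Switch each volume in turn, waiting for it to settle before starting the next."""
    done: List[str] = []
    for vol in volumes:
        switch(vol)
        waited = 0.0
        # Do not touch the next volume until this one reports that it has settled,
        # so that only one disk cache is being reloaded at any time.
        while not is_settled(vol):
            if waited >= timeout_seconds:
                raise TimeoutError(f"{vol} did not settle; completed so far: {done}")
            sleep(poll_seconds)
            waited += poll_seconds
        done.append(vol)
    return done
```

Stopping at the first volume that fails to settle, rather than continuing, keeps the remaining volumes in their known state while the operator investigates.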
Verify that no exceptions are listed in the command-line RELOAD response text.
All system resources should be re-integrated as reported by the RELOAD
response text.
Wait a minimum of one minute before proceeding with the next step.
Verify that processor utilization (measured by ViewSys or another system
management application) stabilizes.
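One way to make "stabilizes" concrete is to require that recent utilization samples stay within a narrow band. The sketch below assumes the samples are percentages collected by hand or exported from a monitoring tool; the function name, the window size, and the tolerance are illustrative assumptions, not values from this procedure.

```python
from typing import Sequence


def is_stable(samples: Sequence[float], window: int = 6, tolerance: float = 5.0) -> bool:
    """Return True when the last `window` samples span no more than `tolerance` points."""
    if len(samples) < window:
        # Not enough data yet to judge stability.
        return False
    recent = samples[-window:]
    return max(recent) - min(recent) <= tolerance
```

For example, a series that drops from 90 percent after the reload and then hovers around 50 percent would be reported as stable only once six consecutive samples fall within five points of each other.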
Verify that the event flood triggered by halting a processor has receded by
checking events in the two event logs ($0 and $ZLOG) using the “real-time”
setting.
Caution: All subsystems should stop generating events indicating they
have no backup. Wait until these events cease.
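The "wait until these events cease" condition can be expressed as a quiet-period check. This is a minimal sketch under stated assumptions: the timestamps of the "no backup" events are assumed to have been gathered separately (for example, noted from the event viewer), and the 60-second quiet period is an illustrative default rather than a value from this procedure.

```python
from typing import Iterable


def quiet_for(event_times: Iterable[float], now: float, quiet_seconds: float = 60.0) -> bool:
    """Return True when no event has been seen in the last `quiet_seconds` seconds.

    `event_times` and `now` are timestamps in seconds on the same clock.
    """
    latest = max(event_times, default=None)
    # No events at all counts as quiet; otherwise measure time since the newest one.
    return latest is None or (now - latest) >= quiet_seconds
```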
Verify that “SCF STATUS SERVERNET $ZSNET” shows no problems.
Verify that “SCF STATUS SAC $ZZLAN.*,DETAIL” shows no problems.