Dynamic Reconfiguration (DR) User’s Guide SPARC Enterprise M4000 / M5000 / M8000 / M9000 Servers English
SPARC® Enterprise M4000/M5000/M8000/M9000 Servers Dynamic Reconfiguration (DR) User's Guide Order No. U41684-J-Z816-2-76 Part No.
Copyright 2007 FUJITSU LIMITED, 1-1, Kamikodanaka 4-chome, Nakahara-ku, Kawasaki-shi, Kanagawa-ken 211-8588, Japan. All rights reserved. Sun Microsystems, Inc. provided technical input and review on portions of this material. Sun Microsystems, Inc.
Sun Microsystems, Inc. and Fujitsu Limited each own and control intellectual property rights relating to the products and technologies described in this document.
Contents Preface 1. 2. xiii Overview of Dynamic Reconfiguration 1.1 DR 1.2 Basic DR Functions 1–1 1–1 1–5 1.2.1 Adding a System Board 1.2.2 Deleting a System Board 1.2.3 Moving a System Board 1.2.4 Replacing a System Board 1.3 Security 1.4 Overview of DR User Interfaces 1–6 1–6 1–6 1–7 1–7 1–7 What You Must Know Before Using DR 2.1 System Configuration 2.1.1 2–1 2–1 System Board Components 2.1.1.1 CPU 2.1.1.2 Memory 2.1.1.3 I/O Device 2–1 2–4 2–5 2–9 2.1.
2.2 2.3 2.4 2.5 vi 2.1.4 Checklists for System Configuration 2.1.5 Reservation of Domain Configuration Changes Conditions and Settings Using XSCF 2.2.1 Conditions Using XSCF 2.2.2 Settings Using XSCF 2–11 2–12 2–12 2–13 2.2.2.1 Configuration Policy Option 2.2.2.2 Floating Board Option 2–14 2.2.2.3 Omit-memory Option 2–15 2.2.2.4 Omit-I/O Option 2–13 2–15 Conditions and Settings Using Solaris OS 2–16 2.3.1 I/O and Software Requirements 2–16 2.3.
3. XSCF Failover 2.5.7 Kernel Memory Board Deletion 2.5.8 Deletion of Board with DVD Drive DR User Interface 3.1 4. 2.5.6 2–29 2–30 3–1 How To Use the DR User Interface 3–1 3.1.1 Displaying Domain Information 3.1.2 Displaying Domain Status 3.1.3 Displaying System Board Information 3.1.4 Displaying Device Information 3.1.5 Displaying System Board Configuration Information 3.1.6 Adding a System Board 3.1.7 Deleting a System Board 3.1.8 Moving a System Board 3.1.
4.5 4.6 Examples: Replacing a System Board 4.5.1 Example: Replacing a Uni-XSB System Board 4.5.2 Example: Replacing a Quad-XSB System Board A.2 4.6.1 Example: Reserving a System Board Add 4.6.2 Example: Reserving a System Board Delete 4.6.3 Example: Reserving a System Board Move Solaris OS Messages Transition Messages A.1.2 PANIC Messages A.1.3 Warning Messages viii 4–20 4–22 4–23 A–1 A–3 A–4 A–23 A.2.1 addboard A.2.2 deleteboard A–26 A.2.3 moveboard A–28 A.2.4 setdcl A.2.
Figures FIGURE 1-1 Uni-XSB and Quad-XSB (Midrange Servers) 1–2 FIGURE 1-2 Uni-XSB and Quad-XSB (High-end Servers FIGURE 1-3 DR Processing Flow FIGURE 2-1 Example of Hardware Configuration (with Uni-XSB of Midrange Server) FIGURE 2-2 Example of Hardware Configuration (with Quad-XSBs of Midrange Server) 2–3 FIGURE 2-3 Example of a Hardware Configuration (with Uni-XSBs of High-end Server) 2–4 FIGURE 2-4 Example of a Hardware Configuration (with Quad-XSBs of High-end Server) FIGURE 2-5 Flow o
FIGURE 4-10 Example: Reserving a System Board Add
FIGURE 4-11 Example: Reserving a System Board Delete
FIGURE 4-12 Example: Reserving a System Board Move
SPARC Enterprise Mx000 Servers Dynamic Reconfiguration User’s Guide • September 2007
Tables TABLE 1-1 Basic DR Terms 1–3 TABLE 1-2 Terms Related to Hardware Configurations TABLE 2-1 Unit of Degradation TABLE 2-2 Domain Status TABLE 2-3 System Board Management Items 2–18 TABLE 2-4 System Board Management Items 2–19 TABLE 3-1 DR Display Commands TABLE 3-2 DR Operation Commands TABLE 3-3 Options of the showdcl Command TABLE 3-4 Items of Domain Information to be Displayed TABLE 3-5 Options of the showdomainstatus Command TABLE 3-6 Items of Domain Information to be Dis
TABLE 3-15 Options of the moveboard Command
TABLE 3-16 DR Display Commands
TABLE 3-17 DR Operation Commands
TABLE 3-18 DR-related Commands
Preface This manual describes the Dynamic Reconfiguration (DR hereafter) function provided by SPARC Enterprise servers. This manual is intended for users, specifically system management administrators who conduct operations on systems and domains.
Audience
This manual is intended for users who administer SPARC Enterprise M4000/M5000/M8000/M9000 servers (hereinafter referred to as XSCF users). XSCF users are required to have the following knowledge:
■ Solaris™ Operating System and UNIX commands
■ SPARC Enterprise M4000/M5000/M8000/M9000 servers and basic knowledge of XSCF
Structure and Contents of This Manual
This manual is organized as described below:
■ Chapter 1 Overview of Dynamic Reconfiguration
This chapter provides an overview of DR.
Glossary and Index
■ Glossary
The glossary explains the terms used in this manual.
■ Index
The index provides keywords and corresponding reference page numbers so that the reader can easily search for items in this manual as necessary.
SPARC Enterprise Mx000 Servers Documentation
The manuals listed below are provided for reference.
Book Titles Order No.
Book Titles Order No.
Contact the field engineer. The following files or document are provided. i. Firmware program file (XSCF Control Package (XCP) file) ii. XSCF extension MIB definition file Note – XSCF Control Package (XCP) : XCP is a package which has the control programs of hardware that configures a computing system. The XSCF firmware and the OpenBoot PROM firmware are included in the XCP file. c. Fault Management MIB (SUN-FM-MIB) definition file http://src.opensolaris.
Models
The model names used in this manual are as follows.
Server class   Model name
Midrange       SPARC Enterprise M4000
               SPARC Enterprise M5000
High-end       SPARC Enterprise M8000
               SPARC Enterprise M9000
Text Conventions
This manual uses the following fonts and symbols to express specific types of information.
Fonts/symbols  Meaning                                                          Example
AaBbCc123      What you type, when contrasted with on-screen computer output.  This font represents the example of command input in the frame.
Prompt Notations
The prompt notations used in this manual are as follows.
Shell                                     Prompt Notation
XSCF                                      XSCF>
C shell                                   machine-name%
C shell super user                        machine-name#
Bourne shell and Korn shell               $
Bourne shell and Korn shell super user    #
OpenBoot PROM                             ok
Syntax of the Command Line Interface (CLI)
The command syntax is described below.
Command syntax
The command syntax is as follows:
■ A variable that requires input of a value must be enclosed in <>.
■ The command syntax is shown in a frame such as this one.
Example:
XSCF> showuser -a
Software License
The functions explained in this manual use software licensed under the GPL, the LGPL, and other licenses. For license information, see Appendix E, "Software License Condition" in the SPARC Enterprise M4000/M5000/M8000/M9000 Servers XSCF User’s Guide.
Fujitsu Siemens Computers Welcomes Your Comments
We would appreciate your comments and suggestions to improve this document.
CHAPTER 1 Overview of Dynamic Reconfiguration
This chapter provides an overview of Dynamic Reconfiguration, which is controlled by the eXtended System Control Facility (XSCF).
1.1 DR
Dynamic Reconfiguration (referred to as DR in this document) enables hardware resources such as processors, memory, and I/O to be added and deleted even while the Solaris™ Operating System (referred to as OS in this document) is running. DR has three basic functions; i.e.
into four boards is called a Quad-XSB. Each composition of physical unit of the divided PSB is called an eXtended System Board (XSB). These XSBs can be combined freely to create domains. DR functions on these servers are performed on an XSB. This manual uses the term system board unless physical units of PSB and XSB are described. For an explanation of each term, see TABLE 1-2. Note – This document explains DR functions on system boards.
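The relationship between a PSB and its XSBs can be illustrated with a short sketch. It assumes the numbering seen in the hardware configuration figures of this manual (a two-digit PSB number, a hyphen, and an XSB index, such as 00-0) and the PSB range of 00 to 15 given for the showfru(8) command. The function name and the "uni"/"quad" strings are hypothetical and are not part of XSCF.

```python
def xsb_numbers(psb, division):
    """Return the XSB numbers that one PSB provides.

    psb:      PSB number (0 to 15)
    division: "uni" for an undivided PSB, "quad" for a PSB divided into four
              (both strings are illustrative names, not XSCF keywords)
    """
    if not 0 <= psb <= 15:
        raise ValueError("PSB number must be between 00 and 15")
    if division == "uni":
        count = 1   # Uni-XSB: the whole PSB is a single XSB
    elif division == "quad":
        count = 4   # Quad-XSB: the PSB is divided into four XSBs
    else:
        raise ValueError("division must be 'uni' or 'quad'")
    return [f"{psb:02d}-{index}" for index in range(count)]


print(xsb_numbers(0, "uni"))
print(xsb_numbers(1, "quad"))
```

Because DR operates on XSBs, a Quad-XSB PSB yields four separately assignable units, whereas a Uni-XSB PSB yields one.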
FIGURE 1-2 Uni-XSB and Quad-XSB (High-end Servers)
[Figure: a Uni-XSB and a Quad-XSB, each built from a CMU and an IOU, shown as system boards]
TABLE 1-1 and TABLE 1-2 list DR-related terms.
TABLE 1-1 Basic DR Terms
Term     Definition
Add      To connect a system board to a domain and configure it into the Solaris OS of the domain.
Delete   To unconfigure a system board from the Solaris OS of a domain and disconnect it from the domain.
TABLE 1-1 Basic DR Terms
Term         Definition
Unconfigure  To unconfigure a system board in the Solaris OS.
Reserve      To reserve a system board such that it is assigned to or unassigned from a domain on the next reboot or power-cycle.
Install      To insert a system board into a system.
Remove       To remove a system board from a system.
Replace      To remove a system board and then mount it or a new system board, for system maintenance and inspection.
TABLE 1-2 Terms Related to Hardware Configurations (Continued)
Term          Definition
System board  The hardware resources of a PSB or an XSB. A system board is used to describe the hardware resources for operations such as domain construction and identification. In this manual, this refers to the XSB.
Uni-XSB       One of the division types of a PSB. Uni-XSB is the name used when a PSB is logically only one unit (undivided status). It is the default setting for the division type of a PSB.
In the example shown in FIGURE 1-3, system board #2 is deleted from domain A and added to domain B. In this way, the physical configuration of the hardware (mounting locations) is not changed but the logical configuration is changed for management of the system boards. 1.2.1 Adding a System Board You can use DR to add a system board to a domain provided that board is installed in the system and not assigned to another domain. You can do so without stopping the Solaris OS running in the domain.
1.2.4 Replacing a System Board
You can use DR to remove a system board from a domain and either add it back later, or replace it with another system board, provided both boards satisfy DR requirements as described in this document. You can do so without stopping the Solaris OS running in either domain. You replace a system board when exchanging hardware resources such as CPUs, memory, and I/O devices. A system board is replaced in successive stages.
For details of XSCF shell commands provided for DR, see Section 3.1, “How To Use the DR User Interface” on page 3-1. XSCF Web is beyond the scope of this document. See the SPARC Enterprise M4000/M5000/M8000/M9000 Servers XSCF User’s Guide for further information.
CHAPTER 2 What You Must Know Before Using DR This chapter provides information you must know to successfully use the DR functions. 2.1 System Configuration This section describes the conditions, premises, and actions for operating the DR functions to construct a system. 2.1.1 System Board Components There are three types of system board components that can be added and deleted by DR: CPU, memory, and I/O device.
FIGURE 2-1 Example of Hardware Configuration (with Uni-XSB of Midrange Server)
[Figure: an MBU with XSB 00-0 and XSB 01-0; labeled components are CMU, IOU, Memory, and I/O device]
FIGURE 2-2 Example of Hardware Configuration (with Quad-XSBs of Midrange Server)
[Figure: an MBU with XSB 00-0 to XSB 00-3 and XSB 01-0 to XSB 01-3; labeled components are CMU, IOU, Memory, and I/O device]
FIGURE 2-3 Example of a Hardware Configuration (with Uni-XSBs of High-end Server)
[Figure: XSB 00-0; labeled components are CMU, IOU, Memory, and I/O device]
FIGURE 2-4 Example of a Hardware Configuration (with Quad-XSBs of High-end Server)
[Figure: labeled component is CMU]
2.1.1.
A CPU to be deleted must meet the following conditions:
■ No running process is bound to the CPU to be deleted. If a running process is bound to the target CPU, you must unbind or stop the process.
■ The CPU to be deleted does not belong to any processor set. If the target processor belongs to a processor set, you must delete the CPU from the processor set by using the psrset(1M) command.
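The two conditions above can be expressed as a simple pre-check. The following is only an illustrative model: the CpuState class and its fields are hypothetical, and in practice the binding and processor set information would come from Solaris tools rather than from hand-built objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CpuState:
    """Hypothetical snapshot of one CPU in a domain."""
    cpu_id: int
    bound_pids: List[int] = field(default_factory=list)  # processes bound to this CPU
    processor_set: Optional[int] = None                   # processor set ID, or None


def cpu_delete_blockers(cpu):
    """Return the reasons why this CPU does not yet meet the deletion conditions."""
    reasons = []
    if cpu.bound_pids:
        reasons.append(f"processes bound to CPU {cpu.cpu_id}: {cpu.bound_pids}")
    if cpu.processor_set is not None:
        reasons.append(f"CPU {cpu.cpu_id} belongs to processor set {cpu.processor_set}")
    return reasons


free_cpu = CpuState(cpu_id=0)
busy_cpu = CpuState(cpu_id=1, bound_pids=[4321], processor_set=2)
print(cpu_delete_blockers(free_cpu))
print(cpu_delete_blockers(busy_cpu))
```

An empty list means that neither condition blocks deletion of the CPU. Each entry in a non-empty list corresponds to one corrective action: unbinding or stopping the process, or removing the CPU from its processor set.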
■ To control whether a system board contains kernel memory, use one or more of the following features, which are described below: kernel cage, floating boards, and kernel memory assignment.
■ To copy kernel memory from one board to another, use the copy-rename operation. Copy-rename makes it possible for you to perform DR operations on kernel memory boards.
(1.1) Kernel Cage
The kernel cage function must be in use for DR operations on memory to succeed.
When the kernel cage is enabled, kernel memory is assigned to system boards in the order of their address spaces. The kernel cage begins in the first address space (which initially corresponds to the non-floating board with the lowest LSB number). If the kernel requires more memory, then the kernel cage expands to the next address space (which initially corresponds to the non-floating board with the next-lowest LSB number), and so on.
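The expansion order described above can be sketched as follows. This is a simplified model, not the Solaris allocator: it assumes that all memory on a board is available to the cage, it considers only the initial layout in which address spaces follow LSB order, and it ignores floating boards entirely. The board tuples and sizes are hypothetical.

```python
def kernel_cage_layout(boards, kernel_mb):
    """Model which boards initially hold kernel memory.

    boards:    list of (lsb_number, memory_mb, is_floating) tuples (hypothetical data)
    kernel_mb: amount of kernel memory required, in MB
    Returns (layout, unplaced_mb), where layout is a list of (lsb_number, used_mb).
    """
    layout = []
    remaining = kernel_mb
    # The cage starts on the non-floating board with the lowest LSB number
    # and expands to the next-lowest one only when more memory is needed.
    for lsb, memory_mb, is_floating in sorted(boards):
        if is_floating or remaining <= 0:
            continue
        used = min(memory_mb, remaining)
        layout.append((lsb, used))
        remaining -= used
    return layout, remaining


boards = [(2, 8192, False), (0, 4096, False), (1, 8192, True)]
print(kernel_cage_layout(boards, 6000))
```

In this example the cage fills LSB 0 first and then spills onto LSB 2, skipping the floating board at LSB 1, so both LSB 0 and LSB 2 become kernel memory boards.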
Once the copy-destination board has been selected, the Solaris OS performs a memory deletion on the selected user memory board. Then, the kernel memory on the system board to be deleted is copied into memory on the selected copy-destination system board. The system is suspended while the copying is in progress. After all the memory is copied, the address space of the copy-destination board is renamed to that of the kernel memory board being deleted.
Deleting or moving a user memory board fails if either of the following statements is true:
■ The swap area does not have sufficient free space to save data from the user memory to be deleted.
■ There are too many locked or ISM pages to be covered by the memory on other system boards.
2.1.1.3 I/O Device
(1) Adding an I/O Device
The device driver processing executed by the Solaris OS is based on the premise that all device drivers dynamically recognize newly added devices.
Note – Do not move a device that is part of a redundant configuration from one domain to another domain. The consequences of two domains simultaneously accessing the same device through different paths could be disastrous, such as data corruption. 2.1.2 System Board Configuration Requirements XSCF enables the Uni-XSB or Quad-XSB setting according to the configuration conditions to determine the division type.
Moreover, a system board that is pooled can be assigned to a domain only when it is registered on DCL. Pooled system boards must be properly managed. You can add and delete system boards by combining the system board pooling function with the floating board, omit-memory, and omit-I/O options described in Section 2.2, “Conditions and Settings Using XSCF” on page 2-12. 2.1.4 Checklists for System Configuration This section describes the prerequisites and the checklists for configuring the system for DR. 1.
When kernel memory is copied, the Solaris OS is temporarily suspended. Therefore, you must understand the effect of disconnecting the network connection with remote systems and other influences of the DR operation on job processes before determining system operations. 2.1.
operation. As a matter of course, system boards to be deleted, moved, or replaced have already been registered in the DCL. You need not confirm that these boards have been registered in the DCL. For details about the DCL and how to register system boards in the DCL and to confirm registration, refer to SPARC Enterprise M4000/M5000/M8000/M9000 Servers XSCF User’s Guide. 2.2.
Note – Enable the configuration policy option when the power supply of the domain is turned off.
TABLE 2-1 Unit of Degradation
Value   Unit of degradation
FRU     Hardware is degraded in units of components such as CPU and memory.
XSB     Hardware is degraded in units of system boards (XSB).
System  Hardware is degraded in units of domains or the relevant domain is stopped without degradation.
2.2.2.2 Floating Board Option
The floating board option controls kernel memory allocation.
2.2.2.3 Omit-memory Option When the omit-memory option is enabled, the memory on a system board cannot be used in the domain. Even when a system board actually has memory, this option enables you to make the memory on the system board unavailable through a DR operation to add or move the system board. This option can be used when the target domain needs only the CPU (and not the memory) of the system board to be added.
Note – Enable the omit-I/O option when the system board is in the system board pool or when the system board is not connected to the domain configuration. 2.3 Conditions and Settings Using Solaris OS This section describes the operating conditions and settings required for DR operations. 2.3.1 I/O and Software Requirements As described in Section 2.1, “System Configuration” on page 2-1, all I/O device drivers and software installed in a domain where DR is to be used must support DR.
If the kernel cage is disabled, the system may run more efficiently, but kernel memory will be spread among all boards and DR operations will not work on memory. To determine whether kernel cage memory is enabled after the system has been rebooted, check the following message output from the /var/adm/messages file: NOTICE: DR kernel Cage is ENABLED If the kernel cage is disabled, the message will be: NOTICE: DR kernel Cage is DISABLED In most cases the kernel cage should be enabled.
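Checking for these messages can be automated. The sketch below scans log lines for the two NOTICE strings quoted above and reports the most recent one. The timestamp and host prefix in the sample lines are hypothetical, and reading the lines from /var/adm/messages is left to the caller.

```python
import re

# Matches the two messages quoted in this section.
CAGE_PATTERN = re.compile(r"NOTICE: DR kernel Cage is (ENABLED|DISABLED)")


def kernel_cage_enabled(log_lines):
    """Return True or False for the most recent cage message, or None if none is found."""
    status = None
    for line in log_lines:
        match = CAGE_PATTERN.search(line)
        if match:
            status = match.group(1) == "ENABLED"  # later lines override earlier ones
    return status


# Hypothetical sample lines; only the NOTICE text comes from this manual.
sample = [
    "Sep  1 09:00:00 host genunix: NOTICE: DR kernel Cage is DISABLED",
    "Sep  2 10:30:00 host genunix: NOTICE: DR kernel Cage is ENABLED",
]
print(kernel_cage_enabled(sample))
```

Taking the last match rather than the first is a deliberate choice here, because the messages file can contain entries from several boots.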
XSCF manages the following aspects of domain status:
TABLE 2-2 Domain Status
Status                        Description
Powered Off                   Domain power is off.
Initialization Phase          POST processing or OpenBoot PROM initialization is in progress.
OpenBoot Executing Completed  Initialization of OpenBoot PROM is completed.
Booting                       Solaris OS is being booted or, because the domain has been shut down or reset, the system is in the OpenBoot PROM running state or is suspended in the OpenBoot PROM (ok prompt) state.
The table below lists the status types available for individual management items.
TABLE 2-4 System Board Management Items
Management item  Status     Description
Power            Power Off  The system board is powered off and cannot be used.
                 Power On   The system board is powered on.
Test             unmount    The system board is not mounted or cannot be recognized, perhaps because it is faulty.
                 unknown    The system board is not being diagnosed.
To perform a DR operation for a system board, you must determine the method of DR operation according to the status of the target system board. You can display and reference the status of each system board via a user interface provided by XSCF. For details of the user interface, see Chapter 3, DR User Interface. 2.4.3 Flow of DR Processing This section describes the flow of DR processing and the changes in system board status during individual DR operations. 2.4.3.
FIGURE 2-5 Flow of System Board Addition Processing
[Flowchart: a system board in DCL registration status or in the system board pool goes through the addition or reservation process (Test: passed; Assignment: available, then assigned). A request to add the system board, or a domain reboot after registration or reservation, starts diagnosis (Test: testing; Assignment: assigned). If an error is found, the board enters error status (Test: fail; Assignment: assigned). When diagnosis is completed, processing continues with the domain configuration.]
Each system board status indicated in FIGURE 2-6 is the main status that is changed.
2.4.3.3 Flowchart: Moving a System Board The flow of DR operations and the transition of system board status when a system board has been moved or reserved for a move are described in the schematic flowchart, below. Each system board status indicated in FIGURE 2-7 is the main status that is changed. For the flow of system board addition processing or deletion processing and the related system board status, see Section 2.4.3.1, “Flowchart: Adding a System Board” on page 2-20 or Section 2.4.3.
FIGURE 2-7 Flow of System Board Move Processing
[Flowchart with two paths, the move process and the move reservation process. In the move process, the system board is deleted from the original domain. In the move reservation process, deletion of the system board from the original domain is reserved and takes effect when the original domain is rebooted. Status labels shown: Assignment: unavailable or assigned; Connectivity: disconnected; Configuration: unconfigured.]
2.4.3.4 Flowchart: Replacing System Board The flow of DR operations and the transition of system board status when a system board has been replaced are described using the schematic flowchart. Each system board state indicated in FIGURE 2-8 is the main status that is changed. The sample status before and after replacement as shown in the figure are explained below. The actual status after hardware replacement may not match the indicated status.
FIGURE 2-8 Flow of System Board Replacement Processing
[Flowchart: in the deletion process, the system board is deleted, either remaining in DCL registration status (Assignment: assigned) or also being deleted into the system board pool (Assignment: available). In the replacement process, the hardware is replaced and diagnosed. After replacement is completed, the board returns to DCL registration status (Test: passed; Assignment: assigned) or to the system board pool (Test: passed).]
2.5 Operation Management
This section describes the premises and the actions for DR operations.
2.5.1 I/O Device Management
Upon the addition of a system board, device information is reconfigured automatically. However, addition of the system board and the reconfiguration of device information do not end at the same time. Sometimes, device links in the /dev directory are not automatically cleaned up by the devfsadmd(1M) daemon. You can clean up these device links manually by using the devfsadm(1M) command.
memory contents. Be aware that some of the total swap space may be supplied by disks that are attached to the board to be deleted. When making your assessment, be certain to also account for the swap space that will be lost.
■ If the size of available memory (e.g., 1.5 gigabytes) is larger than the size of deleted memory (e.g., 1 gigabyte), the total size of available memory will be 0.5 gigabytes after deleting the system board.
■ If the size of available memory (e.g., 1.
For example, when a kernel memory board with memory mirror mode enabled is deleted or moved, kernel memory is moved from the kernel memory board to another system board. Kernel memory is moved normally even if memory mirror mode is disabled for the move-destination system board. However, this operation results in lowered reliability of memory on the new kernel memory board.
2.5.8 Deletion of Board with DVD Drive
To delete the system board to which the server’s DVD drive is connected, execute the following steps:
1. Stop the vold(1M) daemon by disabling the volfs service.
# /usr/sbin/svcadm disable volfs
2. Execute the DR operation.
3. Restart the vold(1M) daemon by enabling the volfs service.
# /usr/sbin/svcadm enable volfs
For details, see the vold(1M) Solaris man page.
CHAPTER 3 DR User Interface This chapter describes the user interfaces for DR. 3.1 How To Use the DR User Interface XSCF provides two user interfaces for DR: the command line interface by XSCF shell, and the browser-based user interface by XSCF Web. This section describes the main XSCF shell commands used for DR. For other related commands, see Section 3.2, “Command Reference” on page 3-25. For XSCF Web, see Section 3.2, “Command Reference” on page 3-25 and Section 3.3, “XSCF Web” on page 3-27.
TABLE 3-2 DR Operation Commands
Command name  Function
setdcl        Update and edit the DCL.
setupfru      Set the division type and memory mirror mode for a PSB.
addboard      Add a system board to a domain.
deleteboard   Delete a system board from a domain.
moveboard     Move a system board between domains.
The sections below describe the DR display and DR operation commands in detail and show examples.
TABLE 3-3 Options of the showdcl Command
Option        Description
-a            Displays configuration information and status of all domains.
-v            Displays detailed domain configuration information.
-h            Displays usage information.
-d domain_id  Displays information about the specified domain, where domain_id is the domain number, possibly 0 to 23, depending on server model. Only one domain ID can be specified.
-l lsb        Displays information about the specified logical system board (LSB), numbered 00 to 15.
TABLE 3-4 Items of Domain Information to be Displayed (Continued)
Display items  Description
Float          Setting of floating board option
               true    Enabled: Board is designated as a floating board.
               false   Disabled: Board is not designated as a floating board.
Cfg-policy     Setting of configuration policy
               FRU     Degradation in units of components.
               XSB     Degradation in units of XSB.
               System  Stopping of domain without degradation.
The table below lists the items displayed by the showdcl(8) command.
3.1.2 Displaying Domain Status The showdomainstatus(8) command lists the domains in the system and their status. This command displays the same domain status information as the showdcl(8) command. Use the showdomainstatus(8) command to check domain status before and after a DR operation.
TABLE 3-6 Items of Domain Information to be Displayed (Continued)
Display items  Description
Status         Domain status
               Powered Off                   Domain power is off.
               Initialization Phase          POST processing or OpenBoot PROM initialization is in progress.
               OpenBoot Executing Completed  Initialization by OpenBoot PROM is completed.
The following examples show the format and options of the showboards(8) command.
showboards [-v] -a [-c sp]
showboards [-v] -d domain_id [-c sp]
showboards [-v] xsb
showboards -h
TABLE 3-7 Options of the showboards Command
Option  Description
-v      Displays detailed information about the system board.
-a      Displays information about all mounted system boards.
-h      Displays the usage information.
TABLE 3-8 Display items Description Assignment Status of assignment to domain configuration Pwr Conn Conf 3-8 Items of System Board Information to be Displayed (Continued) Unavailable The system board cannot be used. The system board may be unrecognizable because it is not mounted or it is faulty, the domain or system board may not have been configured, or the system board may be assigned to another domain.
TABLE 3-8 Items of System Board Information to be Displayed (Continued)
Display items  Description
Test           Diagnostic status of system board
               Unmount  The system board is not mounted or cannot be recognized because it is faulty.
               Unknown  The system board is not being diagnosed.
               Testing  The system board is being tested.
               Passed   The system board was tested, and passed.
               Failed   The system board was tested, and an error was detected. The system board cannot be used or has been degraded.
Fault
COD
■ Example 2: Display of detailed information on all system boards
XSCF> showboards -v -a
XSB  R DID(LSB) Assignment Pwr Conn Conf Test    Fault    COD
--------------------------------------------------------------------------
00-0   00(00)   Assigned   y   y    y    Passed  Normal   n
00-1   00(01)   Assigned   y   n    n    Passed  Degraded n
00-2 * SP       Available  y   n    n    Unknown Normal   n
00-3   01(15)   Assigned   y   y    y    Passed  Normal   n
■ Example 3: Display of information on the system board in the system board pool in domain #0
XSCF> showboards -c sp -
Note – The showdevices(8) command only reports information about a running domain. TABLE 3-9 Options of the showdevices Command Option Description -v Specifies that the command displays information about all devices. Information about not only the management target devices but also other devices is displayed.
TABLE 3-10 Domain Information Displayed by the showdevices command Display items Description CPU CPU information. Memory IO Devices DID Domain ID. XSB System board number. id CPU ID. state CPU status. speed CPU frequency (MHz). ecache CPU cache size (Megabyte: MB). usage Description of instance using resources. Memory information. DID Domain ID XSB System board number board mem Size of memory on system board (MB).
■ Example: Display of device information on XSB00-0 XSCF> CPU: ---DID 00 00 showdevices 00-0 XSB 00-0 00-0 id 0 1 state on-line on-line speed 2048 2048 ecache 4 4 board perm remaining DID XSB mem MB mem MB mem MB 00 00-0 8192 2048 base domain target deleted address mem MB XSB mem MB 0x000003c000000000 65536 Memory: ------- I/O Devices: ---------DID XSB 00 00-0 00 00-0 00 00-0 00 00-0 10.1.1.1 3.1.
TABLE 3-11 Options of the showfru Command
Option    Description
-a        Specifies that the command display all configuration information on devices of the type specified by devtype.
-h        Displays usage information.
device    Specifies a device type. Specify “sb” for DR.
location  Specifies a device name. Specify a physical system board (PSB) number as a decimal number from 00 to 15.
3.1.6 Adding a System Board Use the addboard(8) command to add a system board to a domain or reserve the addition of a system board to a domain based on the DCL. The system board must already be registered in the target domain’s DCL. Use the showdcl(8) command to check whether a system board is registered in the DCL. To register a system board in the DCL, use the setdcl(8) command. Before executing the addboard(8) command, check the status of the DR-target domain and system board.
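The checks to make before running addboard(8) can be summarized in a small sketch. It combines the DCL registration requirement stated above with the conditions from Section 1.2.1 (the board is installed and is not assigned to another domain). The dictionaries and set used here are hypothetical stand-ins for information that would be read from showdcl(8) and showboards(8) output; the sketch does not call XSCF.

```python
def addboard_blockers(xsb, target_domain, dcl, assignments, installed):
    """Return the reasons why xsb should not yet be added to target_domain.

    dcl:         hypothetical dict, domain ID -> set of XSBs registered in that domain's DCL
    assignments: hypothetical dict, XSB -> domain ID the board is currently assigned to
    installed:   hypothetical set of XSBs installed in the system
    """
    reasons = []
    if xsb not in installed:
        reasons.append("board is not installed in the system")
    if xsb not in dcl.get(target_domain, set()):
        reasons.append("board is not registered in the target domain's DCL")
    owner = assignments.get(xsb)
    if owner is not None and owner != target_domain:
        reasons.append(f"board is assigned to domain {owner}")
    return reasons


dcl = {0: {"00-0", "01-0"}}
assignments = {"01-0": 1}
installed = {"00-0", "01-0"}
print(addboard_blockers("00-0", 0, dcl, assignments, installed))
print(addboard_blockers("01-0", 0, dcl, assignments, installed))
```

An empty list suggests that the board is a candidate for addition. A missing DCL entry points to setdcl(8) as the next step.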
TABLE 3-13 Options of the addboard Command (Continued)
Option        Description
-h            Displays the usage information.
-c configure  Specifies that the command add a system board to the domain. If no other -c option is specified, -c configure is the default.
-c assign     Specifies that the command assign a system board to the domain. With this option specified, the command assigns the target system board to the domain.
Note – (Note 3) If a system board has been forcibly added to a domain by the addboard(8) command with the -f option specified, normal operation of all added hardware resources may be disabled. For this reason, you should avoid using the -f option for normal DR operations. After adding a system board by using the addboard(8) command with the -f option specified, be sure to check the status of the added system board and the devices on the system board. 3.1.
TABLE 3-14 Options of the deleteboard Command
Option  Description
-q      Specifies the suppression of output message display. The -y or -n option determines how output messages are automatically answered, whether or not the messages themselves are suppressed (with the -q option) or displayed.
-y      Specifies that a response of "yes" is made automatically to output messages.
Note – (Note 1) The time required for system board deletion processing depends on the amount of hardware resources mounted on the target system board. For this reason, much time may be required for the command to end its operation. If the system board contains kernel memory, the OS is suspended for a while. Note – (Note 2) If the DR processing executed by the deleteboard(8) command fails, the target system board cannot be restored to the previous status.
The following examples show the format and options of the moveboard(8) command.
moveboard [[-q] -{y|n}][-f][-v][-c configure] -d domain_id xsb[xsb...]
moveboard [[-q] -{y|n}][-f][-v] -c assign -d domain_id xsb[xsb...]
moveboard [[-q] -{y|n}][-f][-v] -c reserve -d domain_id xsb[xsb...]
moveboard -h
TABLE 3-15 Options of the moveboard Command
Option  Description
-q      Specifies the suppression of output message display.
TABLE 3-15 Options of the moveboard Command (Continued)
Option Description
-c assign Specifies that the command delete a system board from the move-source domain and assign it to the move-destination domain. The assigned system board is added to the move-destination domain when the addboard(8) command is executed in the move-destination domain, the power of the move-destination domain is turned on, or the move-destination domain is rebooted.
Note – (Note 2) If the DR processing executed by the moveboard(8) command fails, the target system board cannot be restored to the previous status. If DR processing fails, identify the cause of failure based on the error message output by the moveboard(8) command and Solaris OS messages in the move-source and move-destination domains, and then take appropriate corrective action. Note that some errors require one of the domains to be rebooted.
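The moveboard(8) command forms described in this section can also be assembled by a script before they are entered in the XSCF shell. The following Python sketch is illustrative only and is not part of XSCF. The function name build_moveboard and its parameters are hypothetical, and only the options that appear in the synopsis (-q, -y, -n, -f, -v, -c, -d) are used. It assumes that -q is valid only together with -y or -n, as the bracket nesting in the synopsis suggests.

```python
from typing import Optional, Sequence


def build_moveboard(domain_id: int,
                    xsbs: Sequence[str],
                    mode: Optional[str] = None,
                    answer: Optional[str] = None,
                    quiet: bool = False,
                    force: bool = False,
                    verbose: bool = False) -> str:
    """Return a moveboard command line (hypothetical helper, not an XSCF API)."""
    if mode not in (None, "configure", "assign", "reserve"):
        raise ValueError("mode must be configure, assign, or reserve")
    if answer not in (None, "y", "n"):
        raise ValueError("answer must be 'y' or 'n'")
    if quiet and answer is None:
        # Assumption drawn from the synopsis: [[-q] -{y|n}]
        raise ValueError("-q requires -y or -n")
    if not xsbs:
        raise ValueError("at least one XSB must be specified")

    args = ["moveboard"]
    if quiet:
        args.append("-q")
    if answer is not None:
        args.append("-" + answer)
    if force:
        args.append("-f")
    if verbose:
        args.append("-v")
    if mode is not None:
        # When -c is omitted, the synopsis treats the operation as -c configure.
        args += ["-c", mode]
    args += ["-d", str(domain_id)]
    args += list(xsbs)
    return " ".join(args)


if __name__ == "__main__":
    print(build_moveboard(1, ["00-1"], mode="assign"))
```

The helper only builds a string. Whether a particular combination succeeds still depends on the domain and board status checked by the XSCF at execution time.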
Note – (Note 1) Before replacing a system board, you must know the division type of the replacement-target PSB and the configurations and operation status of all domains to which all XSBs on the PSB belong. If the division type of the replacement-target PSB is Quad-XSB and the XSBs on the replacement-target PSB belong to multiple domains, you must consult with all administrators of the relevant domains in advance to adequately adjust the method of replacing the system board.
3.1.10 Reserving a Domain Configuration Change
Use the addboard(8), deleteboard(8), or moveboard(8) command to reserve a domain configuration change. A domain configuration change is reserved when a system board cannot be added, deleted, or moved immediately for operational reasons. The reserved addition, deletion, or move of the system board is executed when the power of the target domain is turned on or off, or the domain is rebooted.
3.2 Command Reference This section lists the DR commands and other commands related to DR. For details of the commands, refer to SPARC Enterprise M4000/M5000/M8000/M9000 Servers XSCF Reference Manual. For the DR commands, see Section 3.1, “How To Use the DR User Interface” on page 3-1. Note – (Note 1) Use of each command is restricted to selected administrators only. To use each command, you must have appropriate administrator privileges.
TABLE 3-18 3-26 DR-related Commands Command name Function poweron Turns on the power of all domains or a specified domain. poweroff Turns off the power of all domains or a specified domain. setdscp Configures DSCP network. showdscp Displays the DSCP network configuration. addfru Installs a Field Replaceable Unit (FRU). deletefru Removes a Field Replaceable Unit (FRU). replacefru Replaces a Field Replaceable Unit (FRU).
3.3 XSCF Web XSCF Web lets you execute DR functions from a browser. XSCF Web is beyond the scope of this document. For details, refer to SPARC Enterprise M4000/M5000/M8000/M9000 Servers XSCF User’s Guide. 3.4 RCM Script Reconfiguration Coordination Manager (RCM) is a framework used to manage the dynamic disconnection of system components. RCM provides script functions that enable you to write your own scripts for dynamic reconfiguration.
3-28 SPARC Enterprise Mx000 Servers Dynamic Reconfiguration User’s Guide • September 2007
CHAPTER 4 Practical Examples of DR This chapter provides examples of DR operations, such as the addition, deletion, move, and replacement of system boards. Each example shows an operation procedure using the command line interface of the XSCF shell. Similar procedures can also be applied to DR operations using the browser-based interface of the XSCF Web.
4.1.1 Flow: Adding a System Board FIGURE 4-1 Flow: Adding a System Board Checking operation and selecting a DR operation - Operation status and configuration of a domain - Judgment of whether the DR operation can be performed DR operation possible Checking the domain status DR operation not Stop status possible, or domain of the configuration domain to be changed The domain is operating.
4.1.2 Flow: Deleting a System Board FIGURE 4-2 Flow: Deleting a System Board Checking operation and selecting a DR operation - Operation status and configuration of a domain - Judgment of whether the DR operation can be performed DR operation possible Checking the domain status DR operation not possible, domain DR operation or configuration not possible to be changed The domain is operating.
4.1.
4.1.
4.2 Example: Adding a System Board
This section provides an example of the DR operation to add a system board to a domain. The example follows the procedure in Section 4.1.1, "Flow: Adding a System Board," and adds the system board shown in the figure by using the XSCF shell.
FIGURE 4-5 Example: Adding a System Board (Domain#0 initially contains XSB#00-0; XSB#01-0 is added, so that Domain#0 contains XSB#00-0 and XSB#01-0)
1. Login to XSCF.
2. Check the status of the domain.
If you need to change the PSB configuration, use the setupfru(8) command. If the system board to be added is not registered in the DCL, register the system board in the DCL of the target domain by using the setdcl(8) command.

XSCF> showboards -a
XSB  DID(LSB) Assignment  Pwr  Conn Conf Test    Fault
----------------------------------------------------------------
00-0 00(00)   Assigned    y    y    y    Passed  Normal
01-0 SP       Available   y    n    n    Passed  Normal

4. Add the new system board.
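The status check in step 3 can be automated once the showboards(8) output has been captured as text. The following Python sketch is one possible way to do this and is not part of XSCF. The names parse_showboards and ready_to_add are hypothetical, the parser assumes that every data row has exactly the eight whitespace-separated columns shown above, and the readiness rule (Assignment is Available, Test is Passed, Fault is Normal) is an assumption based on the example output rather than a documented criterion.

```python
from typing import Dict, List

# Sample text copied from the showboards -a example in this section.
SAMPLE = """\
XSB  DID(LSB) Assignment  Pwr  Conn Conf Test    Fault
----------------------------------------------------------------
00-0 00(00)   Assigned    y    y    y    Passed  Normal
01-0 SP       Available   y    n    n    Passed  Normal
"""

FIELDS = ["xsb", "did_lsb", "assignment", "pwr", "conn", "conf", "test", "fault"]


def parse_showboards(text: str) -> List[Dict[str, str]]:
    """Turn captured showboards output into one dict per XSB row."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with an XSB number such as "01-0"; the header
        # and the dashed separator do not, so they are skipped.
        is_xsb = len(parts) > 0 and parts[0][:2].isdigit() and "-" in parts[0]
        if is_xsb and len(parts) == len(FIELDS):
            rows.append(dict(zip(FIELDS, parts)))
    return rows


def ready_to_add(row: Dict[str, str]) -> bool:
    """Assumed rule: the board is in the pool, passed testing, and has no fault."""
    return (row["assignment"] == "Available"
            and row["test"] == "Passed"
            and row["fault"] == "Normal")


if __name__ == "__main__":
    for board in parse_showboards(SAMPLE):
        print(board["xsb"], "ready to add:", ready_to_add(board))
```

Status values that contain spaces would not fit the fixed eight-column assumption and would need a column-position parser instead.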
4.3 Example: Deleting a System Board This section provides an example of operation to delete a system board from a domain. In the example, a procedure conforming to Section 4.1.2, “Flow: Deleting a System Board” on page 4-3, is used, and the system board shown in the figure is deleted using the XSCF shell. FIGURE 4-6 Example: Deleting a System Board Domain#0 XSB#00-0 Domain#0 XSB#01-0 Delete XSB#00-0 XSB#01-0 1. Login to XSCF. 2. Check the status of the domain.
3. Check the status of the system board to be deleted.
Execute the showboards(8) command to display system board information, and then check the status of the system board to be deleted.

XSCF> showboards -a
XSB  DID(LSB) Assignment  Pwr  Conn Conf Test    Fault
-------------------------------------------------------------------
00-0 00(00)   Assigned    y    y    y    Passed  Normal
01-0 00(01)   Assigned    y    y    y    Passed  Normal

4. Delete the system board.
4.4 Example: Moving a System Board This section provides an example of an operation to move a system board between domains. In the example, a procedure conforming to Section 4.1.3, “Flow: Moving a System Board” on page 4-4, is used, and the system board shown in the figure is moved using the XSCF shell. FIGURE 4-7 Example: Moving a System Board Domain#0 XSB#00-0 Domain#1 Domain#0 XSB#01-0 XSB#00-0 Move Domain#1 XSB#00-1 XSB#00-1 XSB#01-0 1. Login to XSCF. 2.
3. Check the status of the move-destination domain. Execute the showdcl(8) command to display domain information, and then check the operation status of the move-destination domain. Based on the operation status of the move-source and move-destination domains, determine whether to perform the DR operation or change the domain configuration. XSCF> showdcl -d 1 DID LSB XSB Status 01 Running 00 01-0 01 00-1 4. Check the status of the system board to be moved.
7. Check the status of the move-destination domain and moved system board.
Execute the showdcl(8) command to check the operation status of the move-destination domain, and then execute the showboards(8) command to check the status of the moved system board.

XSCF> showdcl -d 1
DID   LSB   XSB   Status
01                Running
      00    01-0
      01    00-1

XSCF> showboards 00-1
XSB  DID(LSB) Assignment  Pwr  Conn Conf Test    Fault
-------------------------------------------------------------------
00-1 01(01)   Assigned    y    y    y    Passed  Normal

4.
4.5.1 Example: Replacing a Uni-XSB System Board FIGURE 4-8 Example: Replacing a Uni-XSB System Board Delete Domain#0 XSB#00-0 Faulty system board Replace XSB#01-0 New system board Add 1. Login to XSCF. 2. Check the status of the domain. Execute the showdcl(8) command to display domain information, and then check the operation status of the domain. Based on the operation status of the domain, determine whether to perform the DR operation or replace the system board after stopping the domain.
4. Delete the system board. Execute the deleteboard(8) command to delete the system board. XSCF> deleteboard -c disconnect 01-0 5. Check the status of the system board. Execute the showboards(8) command to display system board information, and then check the status of the system board. XSCF> showboards 01-0 XSB DID(LSB) Assignment Pwr Conn Conf Test Fault ----------------------------------------------------------------01-0 00(01) Assigned y n n Passed Normal 6. Physically replace the system board.
8. Check the status of the domain.
Execute the showdcl(8) command to display domain information, and then check the operation status of the domain. Based on the operation status of the domain, determine whether to perform the DR operation or reboot the domains.

XSCF> showdcl -d 0
DID   LSB   XSB   Status
00                Running
      00    00-0
      01    01-0

9. Add the new system board to the domain.
Execute the addboard(8) command to add the system board to the move-destination domain.

XSCF> addboard -c configure -d 0 01-0

10.
4.5.2 Example: Replacing a Quad-XSB System Board FIGURE 4-9 Example: Replacing a Quad-XSB System Board Domain#0 Faulty Delete XSB#00-0 XSB#01-0 system board XSB#01-1 Replace Domain#1 XSB#01-2 Add XSB#01-3 New system board 1. Login to XSCF. 2. Check the configurations and status of all domains to which the relevant system boards belong.
XSCF> showdcl -a DID LSB XSB Status 00 Running 00 00-0 01 01-0 02 01-1 ------01 Running 00 01-2 01 01-3 3. Check the status of all related system boards. Execute the showboards(8) command to display system board information, and then check the status of all system boards related to the PSB to be replaced. The DR operation for replacement may not be possible if the board to be replaced does not support the DR delete operation.
6. Check the status of all related system boards. Execute the showboards(8) command to display system board information, and then check the status of all related system boards.
9. Check the status of all related domains.
Execute the showdcl(8) command to display domain information, and then check the operation status of all related domains. Based on the operation status of the domain, determine whether to perform the DR operation or reboot the domains.

XSCF> showdcl -a
DID   LSB   XSB   Status
00                Running
      00    00-0
      01    01-0
      02    01-1
-------
01                Powered Off
      00    01-2
      01    01-3

10. Add the new system board to the domain.
Execute the addboard(8) command in the domain to add the new system board.
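The decision in step 9 depends on the Status value that showdcl(8) reports for each domain. The following Python sketch shows one way to extract that value from captured output. It is not part of XSCF: the name parse_showdcl is hypothetical, and the parser assumes the layout shown in the example, in which a domain line carries the DID followed by the status and each following line carries an LSB number and an XSB number.

```python
import re
from typing import Dict

# Sample text modeled on the showdcl -a example in this section.
SAMPLE = """\
DID   LSB   XSB   Status
00                Running
      00    00-0
      01    01-0
      02    01-1
-------
01                Powered Off
      00    01-2
      01    01-3
"""

XSB_PATTERN = re.compile(r"\d{2}-\d")


def parse_showdcl(text: str) -> Dict[str, dict]:
    """Return {DID: {"status": str, "lsb": {LSB: XSB}}} from showdcl output."""
    domains: Dict[str, dict] = {}
    current = None
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the column header, and dashed separators.
        if not parts or parts[0] == "DID" or set(parts[0]) == {"-"}:
            continue
        if len(parts) == 2 and XSB_PATTERN.fullmatch(parts[1]):
            # LSB line: belongs to the most recent domain line.
            if current is not None:
                domains[current]["lsb"][parts[0]] = parts[1]
        else:
            # Domain line: DID followed by a status that may contain spaces.
            current = parts[0]
            domains[current] = {"status": " ".join(parts[1:]), "lsb": {}}
    return domains


if __name__ == "__main__":
    for did, info in parse_showdcl(SAMPLE).items():
        print(did, info["status"], info["lsb"])
```

A script could then treat a domain whose status is Running as a candidate for a DR operation and handle any other status by a domain configuration change, which mirrors the choice described in the step.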
XSCF> showboards -a XSB DID(LSB) Assignment Pwr Conn Conf Test Fault -----------------------------------------------------------------00-0 00(00) Assigned y y y Passed Normal 01-0 00(01) Assigned y y y Passed Normal 01-1 00(02) Assigned y y y Passed Normal 01-2 01(00) Assigned y y y Passed Normal 01-3 01(01) Assigned y y y Passed Normal 4.6 Examples: Reserving Domain Configuration Changes This section provides examples of operations to reserve a change in domain configuration by DR.
2. Check the status of the system board to be added. Execute the showboards(8) command to display system board information, and then check the status of the system board to be added and confirm its registration in the DCL. If you need to change the PSB configuration, use the setupfru(8) command. If the system board is not registered in the DCL, register the system board in the DCL for the target domain by using the setdcl(8) command.
4.6.2 Example: Reserving a System Board Delete FIGURE 4-11 Example: Reserving a System Board Delete Domain#0 XSB#00-0 Domain#0 XSB#01-0 Delete XSB#00-0 XSB#01-0 1. Login to XSCF. 2. Check the status of the domain. Execute the showdcl(8) command to display domain information, and then check the operation status of the domain. Based on the operation status of the domain, determine whether to perform the DR operation or change the domain configuration.
5. Check the reserved status of the system board.
Execute the showboards(8) command with the -v option specified to display system board information, and then confirm that deletion of the system board has been reserved.

XSCF> showboards -v 01-0
XSB  R DID(LSB) Assignment  Pwr  Conn Conf Test    Fault    COD
--------------------------------------------------------------------------
01-0 * 00(01)   Assigned    y    y    y    Passed  Normal   n

6. Stop or reboot the domain.
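In the showboards -v output of step 5, the asterisk in the R column indicates that a configuration change is reserved for the board. The following Python sketch reads that flag from captured output. It is not part of XSCF: the name parse_showboards_v is hypothetical, and the sketch assumes that the R column is blank when nothing is reserved, so an unreserved row has one column fewer after whitespace splitting. The second sample row is invented to exercise that case.

```python
import re
from typing import Dict

SAMPLE = """\
XSB  R DID(LSB) Assignment  Pwr  Conn Conf Test    Fault    COD
--------------------------------------------------------------------------
01-0 * 00(01)   Assigned    y    y    y    Passed  Normal   n
00-0   00(00)   Assigned    y    y    y    Passed  Normal   n
"""

XSB_PATTERN = re.compile(r"\d{2}-\d")


def parse_showboards_v(text: str) -> Dict[str, dict]:
    """Return {XSB: {"reserved": bool, "did_lsb": str, "assignment": str}}."""
    boards: Dict[str, dict] = {}
    for line in text.splitlines():
        parts = line.split()
        # Only rows that start with an XSB number are data rows.
        if len(parts) < 3 or not XSB_PATTERN.fullmatch(parts[0]):
            continue
        reserved = parts[1] == "*"
        # Drop the XSB column, and the R column when it is present.
        rest = parts[2:] if reserved else parts[1:]
        boards[parts[0]] = {
            "reserved": reserved,
            "did_lsb": rest[0],
            "assignment": rest[1],
        }
    return boards


if __name__ == "__main__":
    for xsb, info in parse_showboards_v(SAMPLE).items():
        print(xsb, "reserved" if info["reserved"] else "not reserved")
```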
3. Check the status of the move-destination domain. Execute the showdcl(8) command to display domain information, and then check the operation status of the move-destination domain. Based on the operation status of the move-source and move-destination domains, determine whether to perform the DR operation or change the domain configuration. XSCF> showdcl -d 0 DID LSB XSB Status 00 Running 00 00-0 01 00-1 02 01-0 4. Check the status of the system board to be moved.
8. Check the status of the move-destination domain and moved system board. Execute the showdcl(8) command to check the operation status of the move-destination domain, and then execute the showboards(8) command to check the status of the system board and confirm that addition of the system board has been reserved in the move-destination domain.
APPENDIX A Message Meaning and Handling This appendix explains the meaning and handling of DR-related messages. A.1 Solaris OS Messages This section explains the console messages printed by the DR driver. The output for messages that do not have an output field is console. A.1.1 Transition Messages DR: PROM detach board X [Explanation] Detach system board X. OS configure dr@0:SBX::cpuY [Explanation] Configure CPU Y on system board X.
OS unconfigure dr@0:SBX::memory [Explanation] Unconfigure memory on system board X. OS unconfigure dr@0:SBX::pciY [Explanation] Unconfigure PCI Y on system board X.
[Explanation] Suspending device drivers DR: in-kernel unprobe board [Explanation] Unprobing the board. A.1.2 PANIC Messages URGENT_ERROR_TRAP is detected during FMA. [Explanation] A fatal HW error was encountered during copy-rename. [Remedy] Please contact customer service. Failed to remove CMP X LSB NN [Explanation] There may be inconsistency in the system. [Remedy] Please contact customer service.
[Explanation] Internal error during kernel migration [Remedy] Please contact customer service. CPU nn hang during Copy Rename [Explanation] A fatal HW error was encountered during copy-rename. [Remedy] Please contact customer service. A.1.3 Warning Messages # megabytes not available to kernel cage [Explanation] Lack of memory resource detected. [Remedy] Detach the board, then attach it again. IKP: init failed [Explanation] The initial device tree walk to locate the nodes that are interesting to IKP fails.
[Remedy] Disable interrupt on cpu X with psradm -I and if this command fails again, respond in the manner directed by command message. dr_cancel_cpu: failed to online cpu X [Explanation] Failed to online CPU X. [Remedy] Repeat the action. If this error message appears again, please contact customer service. dr_cancel_cpu: failed to power-on cpu X [Explanation] Failed to power-on cpu X [Remedy] Repeat the action. If this error message appears again, please contact customer service.
[Explanation] There may be inconsistency in the system. [Remedy] Please contact customer service. dr_status: failed to copyout status for board # [Explanation] There may be inconsistency in the system. [Remedy] Please contact customer service. dr_status: unknown dev type (#) [Explanation] There may be inconsistency in the system. [Remedy] Please contact customer service. dr_dev2devset: invalid cpu unit# = # [Explanation] Invalid argument is passed to the driver or there may be inconsistency in the system.
dr_pre_release_cpu: thread(s) bound to cpu X [Explanation] The thread in the process is bound to the detached CPU X. [Remedy] Check if the process bound to the CPU exists by pbind(1M) command. If it exists, unbind from the CPU and repeat the action. dr_pre_release_mem: unexpected kphysm_del_release return value # [Explanation] There may be inconsistency in the system. [Remedy] Please contact customer service.
[Remedy] Please repeat the action. If the problem remains, please contact customer service. dr_release_mem_done: : error noted [Explanation] Error noted for a device during releasing memory. [Remedy] Please contact customer service. drmach_log_sysevent failed (rv #) for SBX [Explanation] There may be minor error in the system. [Remedy] Please contact customer service. unexpected kcage_range_add return value # [Explanation] There may be inconsistency in the system.
Cannot stop user thread: ... [Explanation] The DR driver cannot stop all the user processes in the list. [Remedy] Please contact customer service. [Output] Console and Standard Output Cannot setup memory node [Explanation] DR is unable to read the HW information for the memory device. [Remedy] Please contact customer service. Kernel Migration fails. 0xX [Explanation] Kernel data migration failed as a result of DR detach. [Remedy] Please contact customer service.
Invalid argument [Explanation] Invalid argument is passed to the driver or there may be inconsistency in the system. [Remedy] Repeat the action. If this error message appears again, please contact customer service. [Output] Console and Standard Output Invalid argument: ######## [Explanation] Invalid argument is passed to the driver. [Remedy] Repeat the action. If this error message appears again, please contact customer service.
[Explanation] There may be inconsistency in the system. [Remedy] Please contact customer service. [Output] Console and Standard Output Cannot read property value: device node XXXXXX property: name [Explanation] Fail to get the property from OBP. [Remedy] Please contact customer service. [Output] Console and Standard Output Cannot read property value: property: scf-cmd-reg [Explanation] There may be inconsistency in the system. [Remedy] Please contact customer service.
[Explanation] DR fails to allocate enough memory to perform copy rename. [Remedy] Retry and if the problem persists, contact customer service. Failed to off-line: dr@0:SBX::cpuY [Explanation] Failed to off-line CPU Y on board X. [Remedy] Repeat the action. If this error message appears again, please contact customer service. [Output] Console and Standard Output Failed to on-line: dr@0:SBX::cpuY [Explanation] Failed to online CPU Y on system board X. [Remedy] Online CPU with psradm -n.
[Explanation] Detected lack of memory resource. [Remedy] Check the size of memory, detach the board and attach again. If the problem still exists, please contact customer service. [Output] Console and Standard Output Internal error: dr.c # [Explanation] There may be inconsistency in the system. [Remedy] Please contact customer service. [Output] Console and Standard Output Internal error: dr_mem.c # [Explanation] There may be inconsistency in the system. [Remedy] Please contact customer service.
Memory operation refused: dr@0:SBX::memory [Explanation] The DR operation is refused. [Remedy] Respond in the manner directed by the other message. Memory operation cancelled: dr@0:SBX::memory [Explanation] The DR operation is canceled. [Remedy] Respond in the manner directed by the other message. No device(s) on board: dr@0:SBX [Explanation] There may be inconsistency in the system. [Remedy] Please contact customer service.
[Remedy] Repeat the action. If this error message appears again, please contact customer service. [Output] Console and Standard Output Insufficient memory: dr@0:SBX::cpuY [Explanation] Lack of memory resources detected. [Remedy] Check the size of available memory and detach the board. If the problem still exists, please contact customer service. [Output] Console and Standard Output Invalid argument: dr@0:SBX::cpuY [Explanation] There may be inconsistency in the system.
Operation already in progress: dr@0:SBX::cpuY [Explanation] The operation on cpu Y on system board X is in progress. [Remedy] Repeat the action. If the problem still exists, please contact customer service. [Output] Console and Standard Output dr_move_memory: failed to quiesce OS for copy-rename [Explanation] There is a task not suspended in the process. [Remedy] Repeat the action. If this error message appears again, please contact customer service.
[Output] Console and Standard Output Cannot setup resource map opl-fcodemem [Explanation] Resource memory mapping cannot be set up. [Remedy] Please contact customer service. opl_cfg failed to load, error= [Explanation] opl_cfg module failed to load. [Remedy] Please contact customer service. IKP: failed to read HWD header [Explanation] The header of the hardware descriptor could not be read. [Remedy] Please contact customer service.
[Explanation] A claim request with a nonzero hint came from the fcode interpreter. [Remedy] If DR failed after this message, please contact customer service. opl_claim_memory - unable to allocate contiguous memory [Explanation] Memory allocation failed for the fcode interpreter. [Remedy] If DR failed after this message, please contact customer service. opl_get_fcode: Unable to copy out fcode image [Explanation] Failed to copy out the fcode image to the efcode daemon.
[Explanation] The node was not destroyed. [Remedy] Please contact customer service. IKP: destroy chip (-) failed [Explanation] The node was not destroyed. [Remedy] Please contact customer service. dr_del_mlist_query: mlist=NULL [Explanation] The memory list to be deleted is NULL. This warning is also shown at memoryless board. [Remedy] Please ignore this message on memoryless boards. If DR failed after this message, please contact customer service.
[Remedy] Please contact customer service. I/O callback failed in post-attach [Explanation] I/O callback failed in post-attach [Remedy] Please contact customer service. Kernel Migration fails. 0x%x [Explanation] Internal error happened during kernel migration. [Remedy] Please contact customer service. Failed to add CMP%d on board %d [Explanation] CPU failed to power-on during DR attach. [Remedy] Please contact customer service.
[Remedy] Please contact customer service. Failed to remove CMP xx on board n [Explanation] Internal error during DR operation. [Remedy] Please contact customer service. scf_fmem_cancel() failed rv=0x [Explanation] Internal error during kernel migration. [Remedy] Please contact customer service. scf_fmem_start error [Explanation] SCF fails to start the FMEM operation. It is possible that there is HW error and there is no SCF path or the SP is down. [Remedy] Please contact customer service.
[Explanation] An unknown resource type was found in the resource list that is being freed while the board is unprobed. [Remedy] Please contact customer service. VM viability test failed: dr@0:SBX::memory [Explanation] There is not enough real memory to detach memory on system board X. [Remedy] Check the amount of available real memory, and repeat the action. If this error message appears again, please contact customer service.
SCF error [Explanation] Internal error happened during kernel migration. [Remedy] Please contact customer service. A.2 Command Messages A.2.1 addboard XSB#XX-X will be assigned to DomainID X. Continue? [y|n]: [Explanation] Confirming whether DR operation is going to be executed or not. Input "y" to execute it and "n" to stop it. XSB#XX-X will be configured into DomainID X. Continue? [y|n]: [Explanation] Confirming whether DR operation is going to be executed or not.
XSB#XX-X is currently unavailable for DR. Try again later. [Explanation] The specified system board (XSB#XX-X) is already being used by another operation. [Remedy] A DR or power-off operation is being executed in another session. Wait a while, confirm the XSB status, and then try again. XSB#XX-X has not been registered in DCL. [Explanation] The system board (XSB#XX-X) is not registered in the DCL. [Remedy] Register the DCL information by using setdcl(8). Another DR operation is in progress. Try again later.
[Remedy] Find out the cause of the DR failure by referring to the monitoring messages and the error log. Confirm the patch application status and the XCP version. DR failed. Domain (DomainID X) cannot communicate via DSCP path. [Explanation] DR processing cannot communicate with the domain. Possible causes are that the domain is powered off, the DSCP setting is wrong, or an error occurred on the DSCP path. [Remedy] Check whether the domain is powered off, and check the DSCP setting and any DSCP errors by referring to the monitoring messages and the error log.
An internal error has occurred. Please contact your system administrator. [Explanation] DR failed. There is a possibility that DR failed because of an internal error in XSCF. [Remedy] Find out the cause of the DR failure by referring to the monitoring messages and the error log. Also confirm the XCP version. Timeout detected during self-test of XSB#XX-X. [Explanation] Because the hardware diagnosis in DR did not complete, a timeout occurred. There is a possibility that a hardware error occurred.
[Explanation] The specified system board (XSB#XX-X) is already being used by another operation. [Remedy] A DR or power-off operation is being executed in another session. Wait a while, confirm the XSB status, and then try again. XSB#XX-X has not been registered to DCL. [Explanation] The system board (XSB#XX-X) is not registered in the DCL. [Remedy] Register the DCL information by using setdcl(8). XSB#XX-X is the last LSB for DomainID X, and this domain is still running. Operation failed.
Invalid parameter. [Explanation] There is an error in the specified argument or operand. [Remedy] Confirm the specified argument or operand and execute the command once again. Permission denied. [Explanation] You do not have the required privilege. [Remedy] Confirm the user privilege and the command privilege. On high-end servers, also confirm whether the command was executed on the standby XSCF. A hardware error occurred. Please check the error log for details. [Explanation] Hardware error occurred.
[Explanation] Confirming whether DR operation is going to be executed or not. Input "y" to execute it and "n" to stop it. DR operation canceled by operator. [Explanation] DR operation canceled by operator. Domain (DomainID X) is not currently running. [Explanation] Destination domain #X was not active when "-c configure" was specified. [Remedy] Execute it by specifying "-c assign". XSB#XX-X cannot be moved due to System Board Pool. [Explanation] The XSB in the system board pool cannot be moved.
[Remedy] Power off the domain by specifying "-c reserve". XSB#XX-X detected timeout by DR self test. [Explanation] The timeout occurred during DR processing because the hardware diagnosis did not complete. There is something wrong with the hardware. [Remedy] Find out the cause of the DR failure by referring to the monitoring messages and the error log. Replace the failed component. XSB#XX encountered a hardware error. See error log for details. [Explanation] An error occurred during hardware diagnosis.
[Remedy] Find out the cause of the DR failure by referring to the monitoring messages and the console messages. Remove the cause, and then try again. Invalid parameter. [Explanation] There is an error in the specified argument or operand. [Remedy] Confirm the specified argument or operand and execute the command once again. Permission denied. [Explanation] You do not have the required privilege. [Remedy] Confirm the user privilege and the command privilege.
[Explanation] Confirming whether DR operation is going to be executed or not. Input "y" to execute it and "n" to stop it. XSB#XX-X will be configured into DomainID X. Continue? [y|n]: [Explanation] Confirming whether DR operation is going to be executed or not. Input "y" to execute it and "n" to stop it. XSB#XX-X could not be configured into DomainID X due to operating system error. [Explanation] An error occurred in the DR library of the domain OS during the configuration process.
Invalid parameter. [Explanation] There is an error in the specified argument or operand. [Remedy] Confirm the specified argument or operand and execute the command once again. Permission denied. [Explanation] You do not have the required privilege. [Remedy] Confirm the user privilege and the command privilege. On high-end servers, also confirm whether the command was executed on the standby XSCF. An internal error has occurred. Please contact your system administrator. [Explanation] DR failed.
The specified parameter is not supported in this model. [Explanation] A parameter that is not supported on this server was specified. For this reason, the command was canceled. [Remedy] Confirm the specified parameter and the server model, and execute the command once again. Invalid parameter. [Explanation] There is an error in the specified argument or operand. [Remedy] Confirm the specified argument or operand and execute the command once again. Permission denied. [Explanation] You do not have the required privilege.
[Remedy] Confirm that the DSCP setting is correct, and confirm that the dsc process is running normally on the domain.
APPENDIX B Example: Confirm Swap Space Size
This example shows one way to analyze the physical memory on a system board to determine whether the system has enough swap space to support deletion of a board. It explains how to collect and analyze information using the showdevices(8) command on the XSCF and the swap(1M) command on the Solaris OS. In this example, the system board to be deleted contains physical memory and a disk has been attached to it to provide swap space.
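The comparison that this appendix describes can be expressed as simple arithmetic. The following Python sketch is illustrative only. It assumes that deleting the board removes any swap area on disks attached to the board, and that the swap space remaining afterward must be at least as large as the physical memory on the board. That rule, the function name swap_sufficient, and the swap figures in the usage example are assumptions for illustration and are not values taken from this manual. Only the 2048 MB board memory figure matches the example in this appendix.

```python
def swap_sufficient(available_swap_mb: int,
                    board_swap_mb: int,
                    board_mem_mb: int) -> bool:
    """Return True if enough swap remains after the board is deleted.

    available_swap_mb: swap currently available in the domain (from swap(1M))
    board_swap_mb:     swap provided by disks attached to the board to be deleted
    board_mem_mb:      physical memory on the board (from showdevices(8))
    """
    remaining_swap_mb = available_swap_mb - board_swap_mb
    # Assumed rule: the remaining swap must be able to hold the board's memory.
    return remaining_swap_mb >= board_mem_mb


if __name__ == "__main__":
    # Hypothetical swap figures; 2048 MB is the board memory in this example.
    print(swap_sufficient(available_swap_mb=4096,
                          board_swap_mb=1536,
                          board_mem_mb=2048))
```

With the hypothetical figures above, 4096 MB minus 1536 MB leaves 2560 MB, which is larger than the 2048 MB of board memory, so the check passes.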
XSCF> showdevices 00-0

CPU:
----
DID XSB   id  state    speed  ecache
00  00-0  40  on-line  2048   4
00  00-0  41  on-line  2048   4
00  00-0  40  on-line  2048   4
00  00-0  41  on-line  2048   4

Memory:
-------
          board   perm    base                domain  target  deleted  remaining
DID XSB   mem MB  mem MB  address             mem MB  XSB     mem MB   mem MB
00  00-0  2048    0       0x0000000000000000  4096

IO Devices:
----------
DID XSB   device  resource           usage
00  00-0  sd0     /dev/dsk/c0t0d0s1  swap area

Notice in the Memory section that 2048 MB (2GB) of physical memory is on this bo
Glossary This glossary describes some of the terms used in this manual. Capacity on Demand (COD) CPU/Memory Board unit (CMU) CPU core CPU module Domain Component List (DCL) domain ID (DID) Domain-SP Communication Protocol (DSCP) eXtended system board (XSB) eXtended System Control Facility (XSCF) An option that provides additional CPU processing resources when needed. These additional CPUs are provided on COD CPU boards that are installed in the system.
eXtended System Control facility unit (XSCFU) field-replaceable unit (FRU) firmware Hardware Control Program (HCP) I/O unit (IOU) logical system board (LSB) motherboard unit (MBU) OpenBoot PROM physical system board (PSB) power-on self-test (POST) The XSCF board for this server which contains system administration function and operates with independent processor. A part that can be replaced by field engineers when servicing the system. Firmware is the software to control the system.
privileges Specific permissions granted to users. This system has platform administrator, platform operator, domain administrator, domain operator, domain manager, user administrator, audit administrator, audit operator and field engineer privileges during the XSCF program running. Quad-XSB One of the division types for a PSB to be configured. Quad-XSB is a name for when a PSB is logically divided into four parts. The division type can be changed by using the XSCF command setupfru(8).
Index A E Add, 1-3 addboard, 3-2, 3-15, 3-22 addfru, 3-26 addition, 1-6, 2-20, 2-27, 3-15, 4-2, 4-6 Assign, 1-3 eXtended System Board, 1-4 eXtended System Control Facility (XSCF), 1-7 B I Basic DR Terms, 1-3 I/O device, 2-9, 2-16, 2-27 Install, 1-4 Intimate Shared Memory, 2-8 IO board unit, 1-4 ISM, 2-8 C Capacity on Demand, 2-29 configuration policy, 2-13 Configure, 1-3 Copy-rename, 2-7 CPU, 2-4 D DCL, 1-3, 2-10 degradation, 2-13 Delete, 1-3 deleteboard, 3-2, 3-17, 3-22 deletefru, 3-26 deletion,
move, 1-6, 2-23, 3-19, 4-4, 4-10 moveboard, 3-2, 3-19 system board pool, 2-10 system board status, 2-18, 3-6 system configuration, 2-11 O omit-I/O, 2-15 omit-memory, 2-15 P Physical System Board, 1-4 poweroff, 3-26 poweron, 3-26 PSB, 1-4 Q U Unassign, 1-3 Unconfigure, 1-4 Uni-XSB, 1-5, 2-1, 2-10, 4-13 user memory board, 2-8 X XSB, 1-4 XSCF, 2-12, 2-13 XSCF Web, 3-27 Quad-XSB, 1-5, 2-1, 2-10, 4-16 R RCM Script, 3-27 real-time processes, 2-28 Register, 1-3 Release, 1-3 Remove, 1-4 Replace, 1-4 replace
Herausgegeben von / Published by Fujitsu Siemens Computers GmbH Bestell-Nr./ Order No.