Sun™ Enterprise 10000 DR Configuration Guide Sun Microsystems, Inc. 4150 Network Circle Santa Clara, CA 95054 U.S.A. 650-960-1300 Part No. 816-3630-10 May 2002, Revision A Send comments about this document to: docfeedback@sun.
Copyright 2002 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved. This product or document is distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any. Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.
Sun Enterprise 10000 SSP Attributions: This software is copyrighted by the Regents of the University of California, Sun Microsystems, Inc., and other parties. The following terms apply to all files associated with the software unless explicitly disclaimed in individual files.
Contents

Preface
    Before You Read This Book
    How This Book Is Organized
    Using UNIX Commands
    Typographic Conventions
    Shell Prompts
    Related Documentation
    Accessing Sun Documentation Online
    Sun Welcomes Your Comments
1. Configuring DR
    DR Models
    Enhancements in DR Model 3.0
    Where to Execute DR Commands
    Requirements for Multipathing in DR 3.
    Overview of DR Configuration Tasks
    ▼ To Enable the Kernel Cage
    ▼ To Set Permanent Driver Parameters for Network Drivers
    ▼ To Enable Device Suspension for the soc and pln Drivers
    ▼ To Specify an Unsafe Driver List
    ▼ To Make an Unsupported Tape Device Detach-Safe
    Preparing for DR Detach Operations
    Configuration Changes During DR Operations
    Controlling Forcible Conditions that Affect System Quiescence
    ▼ To Manually Suspend a Suspend-Unsafe Device
    ▼ To Force a System Quie
Preface

This guide describes the domain-side configuration of the Sun Enterprise 10000 server Dynamic Reconfiguration (DR) feature. For information about how to use these features, refer to the appropriate document listed in “Related Documentation” on page viii.

Before You Read This Book

This guide is intended for the Sun Enterprise 10000 system administrator who has a working knowledge of UNIX® systems, particularly those based on the Solaris™ operating environment.
Using UNIX Commands

This document does not contain complete information on basic UNIX commands and procedures such as shutting down the system, booting the system, and configuring devices. See the Solaris software documentation that you received with your system for this information.

Typographic Conventions

Typeface or Symbol: AaBbCc123
Meaning: The names of commands, files, and directories; on-screen computer output
Examples: Edit your .login file. Use ls -a to list all files. % You have mail.
Related Documentation

Application   Title                                                                            Part Number
User          Sun Enterprise 10000 Dynamic Reconfiguration User Guide                          816-3627
              Sun Enterprise 10000 SSP 3.5 User Guide                                          806-7613
              System Administration Guide: IP Services                                         806-4075
              Sun StorEdge Traffic Manager Software Installation and Configuration Guide       816-1420
              Sun Enterprise 10000 InterDomain Networks User Guide                             806-4131
              Sun Enterprise 10000 Dynamic Reconfiguration Reference Manual                    806-7617
              Sun Enterprise 10000 SSP 3.
Sun Welcomes Your Comments Sun is interested in improving its documentation and welcomes your comments and suggestions. You can email your comments to Sun at: docfeedback@sun.com Please include the part number (816-3630-10) of your document in the subject line of your email.
CHAPTER 1 Configuring DR This chapter describes key DR functionality and also guides you through the tasks for configuring DR.
DR Models

There are two models of DR available for the Sun Enterprise 10000 system. DR model 2.0 is sometimes referred to as “legacy DR,” and DR model 3.0 is referred to as “next generation DR.” The following table shows the versions of the Solaris operating environment and the SSP software that are used with DR models 2.0 and 3.0:

DR Model   Solaris Software Versions              SSP Software Versions
2.0        Solaris 2.5.1, 2.6, 7, and 8           3.3, 3.4, or 3.5
3.0        Solaris 8 10/01 and 02/02, Solaris 9   3.
Caution – Before you switch to DR 3.0 in a domain that is running the Solaris 8 10/01 operating environment, you must upgrade the SSP software to version 3.5 because previous versions of SSP do not support DR 3.0 operations. For more information about using DR 2.0, see the Sun Enterprise 10000 Dynamic Reconfiguration (DR) User Guide (part number 806-7616-10). For more information about using DR 3.0, see the Sun Enterprise 10000 Dynamic Reconfiguration (DR) User Guide (part number 816-3627-10).
■ Verify that you have sufficient swap space for your domain. For details, see “Allocating Sufficient Domain Swap Space” on page 15.
■ Qualify any third-party device drivers, as described in “Qualifying Third-Party Device Drivers” on page 15.

Device Prerequisites

DR requires that drivers for devices on boards involved in DR detach operations be both:
■ Detach-safe or not currently loaded
  A detach-safe driver supports the device driver interface (DDI) function, DDI_DETACH.
Note – The drivers currently released by Sun Microsystems that are known to be suspend-safe are: st, sd, isp, esp, fas, sbus, pci, pci-pci, qfe, and hme (Sun FastEthernet™); nf (NPI-FDDI); qe (Quad Ethernet); le (Lance Ethernet); the SSA drivers (soc, pln, and ssd); and the Sun StorEdge A5000 drivers (sf, socal, and ses). For additional information about suspend-safe and detach-safe device drivers, contact your Sun service representative.
Overview of DR Configuration Tasks

This section identifies the configuration tasks that you must complete before running DR operations on Solaris 9 domains (which support only DR model 3.0). Depending on the types of devices on your system boards and the type of DR operation to be performed, you might not need to perform every task described in this section. After you configure DR, or whenever you change the DR configuration, you must reboot your domain.
For example, if you enabled the kernel cage, the following message is generated:

NOTICE: DR Kernel Cage is Enabled

▼ To Enable the Kernel Cage

A caged kernel confines nonpageable memory to a minimal number of system boards (most often one). By default, the kernel cage is disabled, which prevents DR detach operations. If you plan to perform DR detach operations, you must enable the kernel cage by using the system(4) variable kernel_cage_enable, as explained in the following procedure.
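As a rough aid for checking this setting, the following Python sketch scans the text of an /etc/system file for the kernel_cage_enable variable. It assumes the usual system(4) form `set kernel_cage_enable=1`, which is not quoted in this guide, and it treats the last assignment in the file as the effective one, which is also an assumption. The function name is hypothetical.

```python
def kernel_cage_enabled(system_file_text):
    """Return True if the text sets kernel_cage_enable to a non-zero value.

    Assumes /etc/system lines of the form 'set kernel_cage_enable=1'.
    The last matching assignment is treated as the effective one
    (an assumption made for this sketch).
    """
    enabled = False
    for raw_line in system_file_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines ('*' or '#' in the first column).
        if not line or line[0] in "*#":
            continue
        parts = line.split(None, 1)
        if len(parts) != 2 or parts[0] != "set":
            continue
        name, sep, value = parts[1].partition("=")
        if sep and name.strip() == "kernel_cage_enable":
            try:
                enabled = int(value.strip(), 0) != 0
            except ValueError:
                # An unparseable value is treated as "not enabled".
                enabled = False
    return enabled


if __name__ == "__main__":
    sample = "* DR configuration\nset kernel_cage_enable=1\n"
    print(kernel_cage_enabled(sample))
```

A check like this only inspects the file. The setting takes effect after the domain is rebooted, as noted earlier in this section.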
● If you want to set the driver configuration parameters permanently, set the parameters in the /etc/system file or the driver.conf file for a specific driver.

▼ To Enable Device Suspension for the soc and pln Drivers

If your system boards contain soc and pln devices, perform the following steps to make those drivers suspend-safe.

1.
▼ To Make an Unsupported Tape Device Detach-Safe

For the Solaris 9 operating environment, tape devices that are natively supported by Sun Microsystems are suspend-safe and detach-safe. For a list of natively supported drives, refer to the st(7D) man page. If a system board to be detached contains a natively supported tape device, you can safely detach the board without suspending the device.
3. If you want to detach a board that hosts Sun StorEdge A3000 controllers, make those controllers idle or take them offline manually using the rm6 or rdacutil programs. The Sun StorEdge A3000 (formerly known as the RSM Array 2000) has dual controller paths with automatic load balancing and automatic failover functionality. 4.
A failure to quiesce due to open suspend-unsafe devices is known as a forcible condition. You have the option to retry the operation, or you can try to force the quiescence. The conditions that cause processes not to suspend are generally temporary in nature. You can retry the operation until the quiescence succeeds. When you try to force the quiescence, you give the operating environment permission to continue with the quiescence even if forcible conditions are still present.
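The retry-then-force choice described above can be sketched as a small control loop. This is only an illustration: `attempt_quiesce` is a hypothetical callable standing in for whatever actually runs the DR operation, and the retry count and return strings are arbitrary choices made for this sketch, not part of DR.

```python
def quiesce_with_retries(attempt_quiesce, max_retries=3, allow_force=False):
    """Retry a quiesce attempt, then optionally force it.

    attempt_quiesce is a hypothetical callable that accepts force=<bool>
    and returns True when the quiescence succeeds.
    Returns "ok", "forced", or "failed".
    """
    # Conditions that block suspension are usually temporary, so retry first.
    for _ in range(max_retries):
        if attempt_quiesce(force=False):
            return "ok"
    # Forcing lets the quiescence proceed despite remaining forcible conditions.
    if allow_force and attempt_quiesce(force=True):
        return "forced"
    return "failed"


if __name__ == "__main__":
    attempts = []

    def stub(force):
        # Stand-in for a real operation: succeeds on the third attempt.
        attempts.append(force)
        return len(attempts) == 3

    print(quiesce_with_retries(stub))
```

Keeping the force step opt-in mirrors the text: forcing is a deliberate decision by the administrator, not an automatic fallback.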
c. Disconnect the cables to the device.
   For example, if a device that allows asynchronous unsolicited input is open, you can disconnect its cables prior to quiescing the operating environment, preventing traffic from arriving at the device and the device from accessing the domain centerplane. You can reconnect the cables after the operating environment resumes.
d. Unload the device driver by using the modunload(1M) command.

2. Perform the DR operation again.

3. Do the following:
a.
If no target board is found for a copy rename operation, the deleteboard(1M) and moveboard(1M) commands display the following error messages, respectively:

deleteboard: unconfigure SB2: No available memory target: dr@0:SB2::memory
moveboard: unconfigure SB2: No available memory target: dr@0:SB2::memory

Processors

The boot processor is responsible for maintaining the netcon BBSRAM buffer.
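When DR commands are driven from a script, the deleteboard(1M) and moveboard(1M) error messages quoted above can be recognized mechanically. The following Python sketch parses a message of that shape. The pattern is derived only from the two sample messages in this section, so the field names and the assumption that other messages follow the same layout are hypothetical.

```python
import re

# Pattern based solely on the two sample messages quoted in this section.
_NO_TARGET = re.compile(
    r"^(?P<command>deleteboard|moveboard): "
    r"(?P<operation>\w+) (?P<board>SB\d+): "
    r"No available memory target: (?P<attachment>\S+)$"
)


def parse_no_memory_target(message):
    """Return a dict of message fields, or None if the message does not match."""
    match = _NO_TARGET.match(message.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = ("deleteboard: unconfigure SB2: "
              "No available memory target: dr@0:SB2::memory")
    print(parse_no_memory_target(sample))
```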
Remote DR Communication

In Solaris 9 domains, the domain configuration server, dcs(1M), controls DR operations.

▼ To Troubleshoot a Connection Failure During a Solaris 9 (DR Model 3.0) Operation

1. Check the domain.
   dcs(1M) must be configured in the /etc/inetd.conf file of the domain. The following lines must be present in the file:

   sun-dr stream tcp wait root /usr/lib/dcs dcs
   sun-dr stream tcp6 wait root /usr/lib/dcs dcs

2. If the dcs daemon is configured in /etc/inetd.
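The check in step 1 can be automated. The following Python sketch reports which of the two sun-dr entries required in /etc/inetd.conf are missing from a given file's text. The function name is hypothetical, and the sketch assumes that inetd.conf comments start with `#` and that fields are separated by whitespace.

```python
REQUIRED_PROTOCOLS = ("tcp", "tcp6")


def missing_dcs_entries(inetd_conf_text):
    """Return the protocols ('tcp', 'tcp6') that lack a sun-dr entry for dcs."""
    found = set()
    for raw_line in inetd_conf_text.splitlines():
        # Drop comments, then split the entry into whitespace-separated fields.
        fields = raw_line.split("#", 1)[0].split()
        if len(fields) < 7:
            continue
        service, sock_type, proto, wait, user, program = fields[:6]
        args = fields[6:]
        expected = ("sun-dr", "stream", "wait", "root", "/usr/lib/dcs")
        if (service, sock_type, wait, user, program) == expected and args == ["dcs"]:
            found.add(proto)
    return [proto for proto in REQUIRED_PROTOCOLS if proto not in found]


if __name__ == "__main__":
    sample = (
        "sun-dr stream tcp wait root /usr/lib/dcs dcs\n"
        "#sun-dr stream tcp6 wait root /usr/lib/dcs dcs\n"
    )
    print(missing_dcs_entries(sample))
```

An empty result means both entries are present. It does not show that inetd has reread the file or that the daemon is reachable.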