IBM Certification Study Guide AIX HACMP
David Thiessen, Achim Rehor, Reinhard Zettler
International Technical Support Organization
http://www.redbooks.ibm.
SG24-5131-00
First Edition (May 1999)

Take Note! Before using this information and the product it supports, be sure to read the general information in Appendix A, “Special Notices” on page 205.

This edition applies to HACMP for AIX and HACMP/Enhanced Scalability (HACMP/ES), Program Number 5765-D28, for use with the AIX Operating System Version 4.3.2 and later. Comments may be addressed to: IBM Corporation, International Technical Support Organization Dept.
Contents

Figures  ix
Tables  xi
Preface  xiii
The Team That Wrote This Redbook  xiv
Comments Welcome

Chapter 3. Cluster Hardware and Software Preparation  51
3.1 Cluster Node Setup  51
3.1.1 Adapter Slot Placement  51
3.1.2 Rootvg Mirroring  51
3.1.3 AIX Prerequisite LPPs  55
3.1.4 AIX Parameter Settings

5.1.3 Event Notification  122
5.1.4 Event Recovery and Retry  122
5.1.5 Notes on Customizing Event Processing  123
5.1.6 Event Emulator  123
5.2 Error Notification  123

8.1.1 The clstat Command  152
8.1.2 Monitoring Clusters using HAView  152
8.1.3 Cluster Log Files  153
8.2 Starting and Stopping HACMP on a Node or a Client  154
8.2.1 HACMP Daemons  155
8.2.2 Starting Cluster Services on a Node

9.3 VSDs - RVSDs
9.3.1 Virtual Shared Disk (VSDs)
9.3.2 Recoverable Virtual Shared Disk
9.4 SP Switch as an HACMP Network
9.4.1 Switch Basics Within HACMP
9.4.2 Eprimary Management
9.4.3 Switch Failures
Figures
1. Basic SSA Configuration  17
2. Hot-Standby Configuration  31
3. Mutual Takeover Configuration  32
4. Third-Party Takeover Configuration  33
5. Single-Network Setup
Tables
1. AIX Version 4 HACMP Installation and Implementation  4
2. AIX Version 4 HACMP System Administration  5
3. Hardware Requirements for the Different HACMP Versions  8
4. Number of Adapter Slots in Each Model  10
5. Number of Available Serial Ports in Each Model
Preface The AIX and RS/6000 Certifications offered through the Professional Certification Program from IBM are designed to validate the skills required of technical professionals who work in the powerful and often complex environments of AIX and RS/6000. A complete set of professional certifications is available. It includes: • IBM Certified AIX User • IBM Certified Specialist - RS/6000 Solution Sales • IBM Certified Specialist - AIX V4.3 System Administration • IBM Certified Specialist - AIX V4.
• AIX parameters that are affected by an HACMP installation, and their correct settings
• The cluster and resource configuration process, including how to choose the best resource configuration for a customer requirement
• Customization of the standard HACMP facilities to satisfy special customer requirements
• Diagnosis and troubleshooting knowledge and skills

This redbook helps AIX professionals seeking a comprehensive and task-oriented guide for developing the knowledge and skills required for the certification.
POWERparallel Systems area, known as the SP1 at that time. In 1997 he began working on HACMP as the Service Groups for HACMP and RS/6000 SP merged into one. He holds a diploma in Computer Science from the University of Frankfurt in Germany. This is his first redbook. Reinhard Zettler is an AIX Software Engineer in Munich, Germany. He has two years of experience working with AIX and HACMP. He has worked at IBM for two years. He holds a degree in Telecommunication Technology. This is his first redbook.
Chapter 1. Certification Overview
This chapter provides an overview of the skill requirements for obtaining an IBM Certified Specialist - AIX HACMP certification. The following chapters are designed to provide a comprehensive review of specific topics that are essential for obtaining the certification.
1.2 Certification Exam Objectives
The following objectives were used as a basis for developing the certification exam. Some of these topics have been regrouped to provide better organization when discussed in this publication.

Section 1 - Preinstallation
The following items should be considered as part of the preinstallation plan:
• Conduct a planning session.
• Set customer expectations at the beginning of the planning session.
• Gather the customer's availability requirements.
• Create an application server. • Set up Event Notification. • Set up event notification and pre/post event scripts. • Set up error notification. • Post Configuration Activities. • Configure a client notification and ARP update. • Implement a test plan. • Create a snapshot. • Create a customization document. • Perform Testing and Troubleshooting. • Troubleshoot a failed IPAT failover. • Troubleshoot failed shared volume groups. • Troubleshoot a failed network configuration.
1.3 Certification Education Courses Courses and publications are offered to help you prepare for the certification tests. These courses are recommended, but not required, before taking a certification test. At the printing of this guide, the following courses are available. For a current list, please visit the following Web site: http://www.ibm.com/certify Table 1.
The following table outlines information about the next course.

Table 2. AIX Version 4 HACMP System Administration
Course Number: Q1150 (USA); AU50 (Worldwide)
Course Duration: Five days
Course Abstract: This course teaches the student the skills required to administer an HACMP cluster on an ongoing basis after it is installed. The skills that are developed in this course include:
• Integrating the cluster with existing network services (DNS, NIS, etc.)
Chapter 2. Cluster Planning The area of cluster planning is a large one. Not only does it include planning for the types of hardware (CPUs, networks, disks) to be used in the cluster, but it also includes other aspects. These include resource planning, that is, planning the desired behavior of the cluster in failure situations. Resource planning must take into account application loads and characteristics, as well as priorities.
RISC System/6000 models as nodes in an HACMP 4.1 for AIX, HACMP 4.2 for AIX, or HACMP 4.3 for AIX cluster.

Table 3. Hardware Requirements for the Different HACMP Versions

Model                    4.1   4.2   4.3   4.2/ES   4.3/ES
7009 Mod. CXX            yes   yes   yes   no       yes(1)
7011 Mod. 2XX            yes   yes   yes   no       yes(1)
7012 Mod. 3XX and GXX    yes   yes   yes   no       yes(1)
7013 Mod. 5XX and JXX    yes   yes   yes   no       yes(1)
7015 Mod. 9XX and RXX    yes   yes   yes   no       yes(1)
7017 Mod. S7X            yes   yes   yes   no       yes(1)
7024 Mod.
Much of the decision centers around the following areas:
• Processor capacity
• Application requirements
• Anticipated growth requirements
• I/O slot requirements

These considerations are certainly not new, and they apply equally when choosing a processor for a single-system environment. However, when designing a cluster, you must carefully consider the requirements of the cluster as a total entity.
Your slot configuration must also allow for the disk I/O adapters you need to support the cluster’s shared disk (volume group) configuration. If you intend to use disk mirroring for shared volume groups, which is strongly recommended, then you will need to use slots for additional disk I/O adapters, providing I/O adapter redundancy across separate buses. The following table tells you the number of additional adapters you can put into the different RS/6000 models.
2.2 Cluster Networks HACMP differentiates between two major types of networks: TCP/IP networks and non-TCP/IP networks. HACMP utilizes both of them for exchanging heartbeats. HACMP uses these heartbeats to diagnose failures in the cluster. Non-TCP/IP networks are used to distinguish an actual hardware failure from the failure of the TCP/IP software.
• FDDI • SP Switch • SLIP • SOCC • Token-Ring As an independent, layered component of AIX, the HACMP for AIX software works with most TCP/IP-based networks. HACMP for AIX has been tested with standard Ethernet interfaces (en*) but not with IEEE 802.3 Ethernet interfaces (et*), where * reflects the interface number.
Network types also differ in the maximum distance they allow between adapters, and in the maximum number of adapters allowed on a physical network.
• Ethernet currently supports 10 and 100 Mbps, and supports hardware address swapping. Alternate hardware addresses should be in the form xxxxxxxxxxyy, where xxxxxxxxxx is replaced with the first five pairs of digits of the original burned-in MAC address and yy can be chosen freely (see the sketch following this list).
• SP Switch is a high-speed packet switching network, running on the RS/6000 SP system only. It runs bidirectionally at up to 80 MBps, which adds up to 160 MBps of capacity per adapter. This is node-to-node communication and can be done in parallel between every pair of nodes inside an SP. The SP Switch network has to be defined as a private network, and ARP must be enabled. This network is restricted to one adapter per node, so it has to be considered a single point of failure.
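As a sketch of the alternate-address convention described above for Ethernet (the adapter name and addresses here are purely illustrative), the burned-in address can be read with lscfg and the last byte pair varied:

   # Read the burned-in address of the adapter
   lscfg -vl ent0 | grep "Network Address"
   #   Network Address.............10005AFC1234
   # Keeping the first five byte pairs and choosing the last pair
   # freely gives an alternate address such as 10005AFC12FE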
2.2.2.2 Special Considerations As for TCP/IP networks, there are a number of restrictions on non-TCP/IP networks. These are explained for the three different types in more detail below. Serial (RS232) A serial (RS232) network needs at least one available serial port per cluster node. In case of a cluster consisting of more than two nodes, a ring of nodes is established through serial connections, which requires two serial ports per node.
(2) A PCI Multiport Async Card is required in an S7X model, as it has no native serial ports.
(3) Only one serial port is available for customer use, i.e., for HACMP.

In case the number of native serial ports doesn’t match your HACMP cluster configuration needs, you can extend it by adding an eight-port asynchronous adapter, thus reducing the number of available MCA slots, or the corresponding PCI Multiport Async Card for PCI machines, like the S7X model.
SSA subsystems are built up from loops of adapters and disks. A simple example is shown in Figure 1.

Figure 1. Basic SSA Configuration (diagram). Key characteristics shown: a high-performance 80 MB/s interface made up of 20 MBps links to and from the host; loop architecture with up to 127 nodes per loop; up to 25 m (82 ft) between SSA devices with copper cables; up to 2.4 km (1.5 mi) between SSA devices with an optical extender; spatial reuse (multiple simultaneous transmissions).
• 7133 Serial Storage Architecture (SSA) Disk Subsystem Models 010, 500, 020, 600, D40 and T40. The 7133 Models 010 and 500 were the first SSA products, announced in 1995 with the revolutionary new Serial Storage Architecture. Some IBM customers still use the Models 010 and 500, but these have been replaced by the 7133 Model 020 and 7133 Model 600 respectively. More recently, in November 1998, the Models D40 and T40 were announced. All 7133 models have redundant, hot-swappable power and cooling.
Item: Specification
Supported RAID level: 5
Supported adapters: all
Hot-swappable disk: Yes (and hot-swappable, redundant power and cooling)

2.3.1.1 Disk Capacities
Table 8 lists the different SSA disks and provides an overview of their characteristics.

Table 8. SSA Disks

Name            Capacity (GB)   Buffer size (KB)   Maximum transfer rate (MBps)
Starfire 1100   1.1             0                  20
Starfire 2200   2.2             0                  20
Starfire 4320   4.5             512                20
Scorpion 4500   4.5             512                80
Scorpion 9100   9.1             512                160
Sailfin 9100    9.
Feature Code   Adapter Label   Bus   Adapter Description   Adapters per Loop   Hardware RAID Types
6219           4-M             MCA   Enhanced RAID         8(1)                5

(1) See 2.3.1.3, “Rules for SSA Loops” on page 20 for more information.

The following rules apply to SSA Adapters:
• You cannot have more than four adapters in a single system.
• The MCA SSA 4-Port RAID Adapter (FC 6217) and PCI SSA 4-Port RAID Adapter (FC 6218) are not useful for HACMP, because only one can be in a loop.
• A maximum of 48 devices can be connected in a particular SSA loop. • Only one pair of adapter connectors can be connected in a particular SSA loop. • Member disk drives of an array can be on either SSA loop.
2.3.1.4 RAID vs. Non-RAID RAID Technology RAID is an acronym for Redundant Array of Independent Disks. Disk arrays are groups of disk drives that work together to achieve higher data-transfer and I/O rates than those provided by single large drives. Arrays can also provide data redundancy so that no data is lost if a single drive (physical disk) in the array should fail. Depending on the RAID level, data is either mirrored or striped. The following gives you more information about the different RAID levels.
RAID Levels 2 and 3 RAID 2 and RAID 3 are parallel process array mechanisms, where all drives in the array operate in unison. Similar to data striping, information to be written to disk is split into chunks (a fixed amount of data), and each chunk is written out to the same physical position on separate disks (in parallel). When a read occurs, simultaneous requests for the data can be sent to each disk.
As with RAID 3, in the event of disk failure, the information can be rebuilt from the remaining drives. Although a RAID level 5 array uses parity information, it is still important to make regular backups of the data in the array. RAID level 5 stripes data across all of the drives in the array, one segment at a time (a segment can contain multiple blocks). In an array with n drives, a stripe consists of data segments written to n-1 of the drives and a parity segment written to the nth drive.
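The parity mechanism itself can be illustrated with a few lines of shell arithmetic (a toy sketch in a modern shell such as ksh93 or bash; the byte values are arbitrary stand-ins for data segments):

   # XOR three data "segments" to produce the parity segment
   p=$(( 0x5A ^ 0x3C ^ 0xF0 ))
   printf "parity:  0x%02X\n" $p
   # If the drive holding 0x3C fails, its segment can be rebuilt
   # by XORing the surviving segments with the parity:
   printf "rebuilt: 0x%02X\n" $(( 0x5A ^ 0xF0 ^ p ))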
• Array member drives and spares must be on same loop (cannot span A and B loops) on the adapter. • You cannot boot (ipl) from a RAID. 2.3.1.5 Advantages Because SSA allows SCSI-2 mapping, all functions associated with initiators, targets, and logical units are translatable. Therefore, SSA can use the same command descriptor blocks, status codes, command queuing, and all other aspects of current SCSI systems. The effect of this is to make the type of disk subsystem transparent to the application.
2.3.2 SCSI Disks After the announcement of the 7133 SSA Disk Subsystems, the SCSI Disk subsystems became less common in HACMP clusters. However, the 7135 RAIDiant Array (Model 110 and 210) and other SCSI Subsystems are still in use at many customer sites. We will not describe other SCSI Subsystems such as 9334 External SCSI Disk Storage. See the appropriate documentation if you need information about these SCSI Subsystems.
• Enhanced SCSI-2 Differential Fast/Wide Adapter/A (MCA, FC: 2412, Adapter Label: 4-C); not usable with 7135-110
• SCSI-2 Fast/Wide Differential Adapter (PCI, FC: 6209, Adapter Label: 4-B)
• DE Ultra SCSI Adapter (PCI, FC: 6207, Adapter Label: 4-L); not usable with 7135-110
withdraw the 7135 RAIDiant Systems from marketing because it is equally possible to configure RAID on the SSA Subsystems. 2.4 Resource Planning HACMP provides a highly available environment by identifying a set of cluster-wide resources essential to uninterrupted processing, and then defining relationships among nodes that ensure these resources are available to client processes.
• Cascading • Rotating • Concurrent Each of these types describes a different set of relationships between nodes in the cluster, and a different set of behaviors upon nodes entering and leaving the cluster. Cascading Resource Groups: All nodes in a cascading resource group are assigned priorities for that resource group. These nodes are said to be part of that group's resource chain. In a cascading resource group, the set of resources cascades up or down to the highest priority node active in the cluster.
reintegration, a node remains as a standby and does not take back any of the resources that it had initially served. Concurrent Resource Groups: A concurrent resource group may be shared simultaneously by multiple nodes. The resources that can be part of a concurrent resource group are limited to volume groups with raw logical volumes, raw disks, and application servers. When a node fails, there is no takeover involved for concurrent resources.
Figure 2. Hot-Standby Configuration In this configuration, there is one cascading resource group consisting of the four disks, hdisk1 to hdisk4, and their constituent volume groups and file systems. Node 1 has a priority of 1 for this resource group while node 2 has a priority of 2. During normal operations, node 1 provides all critical services to end users. Node 2 may be idle or may be providing non-critical services, and hence is referred to as a hot-standby node.
the cluster becomes a standby node. You must choose a rotating standby configuration if you do not want a break in service during reintegration. Since takeover nodes continue providing services until they have to leave the cluster, you should configure your cluster with nodes of equal power. While more expensive in terms of CPU hardware, a rotating standby configuration gives you better availability and performance than a hot-standby configuration.
When a failed node reintegrates into the cluster, it takes back the resource group for which it has the highest priority. Therefore, even in this configuration, there is a break in service during reintegration. Of course, if you look at it from the point of view of performance, this is the best thing to do, since you have one node doing the work of two when any one of the nodes is down. Third-Party Takeover Configuration Figure 4 illustrates a three node cluster in a third-party takeover configuration.
Here the resource groups are the same as the ones in the mutual takeover configuration. Also, similar to the previous configuration, nodes 1 and 2 each have priorities of 1 for one of the resource groups, A or B. The only thing different in this configuration is that there is a third node which has a priority of 2 for both the resource groups. During normal operations, node 3 is either idle or is providing non-critical services.
• Design the network topology • Define a network mask for your site • Define IP addresses (adapter identifiers) for each node’s service and standby adapters. • Define a boot address for each service adapter that can be taken over, if you are using IP address takeover or rotating resources. • Define an alternate hardware address for each service adapter that can have its IP address taken over, if you are using hardware address swapping. 2.4.3.
Dual Network A dual-network setup has two separate networks for communication. Nodes are connected to two networks, and each node has two service adapters available to clients. If one network fails, the remaining network can still function, connecting nodes and providing resource access to clients. In some recovery situations, a node connected to two networks may route network packets from one network to another. In normal cluster activity, however, each network is separate—both logically and physically.
The following diagram shows a cluster consisting of two nodes and a client. A single public network connects the nodes and the client, and the nodes are linked point-to-point by a private high-speed SOCC connection that provides an alternate path for cluster and lock traffic should the public network fail. Figure 7. A Point-to-Point Connection 2.4.3.2 Networks Networks in an HACMP cluster are identified by name and attribute.
SLIP are considered public networks. Note that a SLIP line, however, does not provide client access. Private A private network provides communication between nodes only; it typically does not allow client access. An SOCC line or an ATM network are also private networks; however, an ATM network does allow client connections and may contain standby adapters. If an SP node is used as a client, the SP Switch network, although private, can allow client access.
until it assumes the shared IP address. Consequently, Clinfo makes known the boot address for this adapter. In an HACMP for AIX environment on the RS/6000 SP, the SP Ethernet adapters can be configured as service adapters but should not be configured for IP address takeover. For the SP switch network, service addresses used for IP address takeover are ifconfig alias addresses used on the css0 network. Standby Adapter A standby adapter backs up a service adapter.
service label (address) instead of the boot label. If the node should fail, a takeover node acquires the failed node’s service address on its standby adapter, thus making the failure transparent to clients using that specific service address. During the reintegration of the failed node, which comes up on its boot address, the takeover node will release the service address it acquired from the failed node.
If you do not use Hardware Address Takeover, the ARP cache of clients can be updated by adding the clients’ IP addresses to the PING_CLIENT_LIST variable in the /usr/sbin/cluster/etc/clinfo.rc file. 2.4.4 NFS Exports and NFS Mounts There are two items concerning NFS when doing the configuration of a Resource Group: Filesystems to Export File systems listed here will be NFS exported, so they can be mounted by NFS client systems or other nodes in the cluster.
application on the takeover node when a fallover occurs. For more information about creating application server resources, see the HACMP for AIX, Version 4.3: Installation Guide, SC23-4278.

2.5.1 Performance Requirements
In order to plan for your application's needs, you must have a thorough understanding of it. One part of this is filling out the Application Planning Worksheets found in Appendix A of the HACMP for AIX Planning Guide, SC23-4277.
Note Application start and stop scripts have to be available on the primary as well as the takeover node. They are not transferred during synchronization; so, the administrator of a cluster has to ensure that they are found in the same path location, with the same permissions and in the same state, i.e. changes have to be transferred manually. 2.5.
2.6 Customization Planning The Cluster Manager’s ability to recognize a specific series of events and subevents permits a very flexible customization scheme. The HACMP for AIX software provides an event customization facility that allows you to tailor cluster event processing to your site. 2.6.1 Event Customization As part of the planning process, you need to decide whether to customize event processing.
event to inform system administrators that traffic may have to be rerouted. Afterwards, you can use a network_up notification event to inform system administrators that traffic can again be serviced through the restored network. 2.6.1.3 Predictive Event Error Correction You can specify a command that attempts to recover from an event script failure. If the recovery command succeeds and the retry count for the event script is greater than zero, the event script is rerun.
2.6.2.1 Single Point-of-Failure Hardware Component Recovery As described in 2.2.1.2, “Special Network Considerations” on page 12, the HPS Switch network is one resource that has to be considered as a single point of failure. Since a node can support only one switch adapter, its failure will disable the switch network for this node. It is strongly recommended to promote a failure like this into a node failure, if the switch network is critical to your operations.
The above example screen will add a Notification Method to the ODM, so that upon appearance of the HPS_FAULT9_ER entry in the error log, the error notification daemon will trigger the execution of the /usr/sbin/cluster/utilities/clstop -grsy command, which shuts HACMP down gracefully with takeover. In this way, the switch failure is acted upon as a node failure.
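For illustration, an equivalent errnotify object could also be added by hand through the ODM; the stanza below is a sketch (the en_name and file name are arbitrary), not the exact object the SMIT panel creates:

   # /tmp/hps_notify.add -- hypothetical stanza file
   errnotify:
           en_name = "HPS_FAULT9_ER_node_down"
           en_persistenceflg = 1
           en_label = "HPS_FAULT9_ER"
           en_method = "/usr/sbin/cluster/utilities/clstop -grsy"

It would be loaded with odmadd /tmp/hps_notify.add.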
2.7 User ID Planning The following sections describe various aspects of User ID Planning. 2.7.1 Cluster User and Group IDs One of the basic tasks any system administrator must perform is setting up user accounts and groups. All users require accounts to gain access to the system. Every user account must belong to a group. Groups provide an additional level of security and allow system administrators to manipulate a group of users as a single entity.
2.7.2 Cluster Passwords
While user and group management is very much facilitated with C-SPOC, the password information still has to be distributed by some other means. If the system is not configured to use NIS or DCE, the system administrator still has to distribute the password information, meaning that found in the /etc/security/passwd file, to all cluster nodes. As before, this can be done through rdist or rcp.
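A minimal sketch of such a distribution (node names are hypothetical, and working rcp permissions between the nodes are assumed):

   # Push the shadow password file from the node where it was changed
   for node in nodeb nodec; do
       rcp -p /etc/security/passwd ${node}:/etc/security/passwd
   done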
2.7.3.3 NFS-Mounted Home Directories on Shared Volumes So, a combined approach is used in most cases. In order to make home directories a highly available resource, they have to be part of a resource group and placed on a shared volume. That way, all cluster nodes can access them in case they need to.
Chapter 3. Cluster Hardware and Software Preparation This chapter covers the steps that are required to prepare the RS/6000 hardware and AIX software for the installation of HACMP and the configuration of the cluster. This includes configuring adapters for TCP/IP, setting up shared volume groups, and mirroring and editing AIX configuration files. 3.1 Cluster Node Setup The following sections describe important details of cluster node setup. 3.1.
mirroring rootvg in order to avoid the impact of the failover time involved in a node failure. In terms of maximizing availability, this technique is just as valid for increasing the availability of a cluster as it is for increasing single-system availability. The following procedure contains information that will enable you to mirror the root volume group (rootvg), using the advanced functions of the Logical Volume Manager (LVM). It contains the steps required to: • Mirror all the file systems in rootvg.
mirrored. If the dump devices are NOT the paging device, that dump logical volume will not be mirrored.

3.1.2.1 Procedure
The following steps assume the user has rootvg contained on hdisk0 and is attempting to mirror the rootvg to a new disk: hdisk1.
1. Extend rootvg to hdisk1 by executing the following:
   extendvg rootvg hdisk1
2. Disable QUORUM by executing the following:
   chvg -Qn rootvg
3. Mirror each logical volume in rootvg onto hdisk1 with the mklvcopy command; to control the exact placement of the copies, use mklvcopy with the “-m” option. You should consult documentation on the usage of the “-m” option for mklvcopy.
4. Synchronize the newly created mirrors with the following command:
   syncvg -v rootvg
5. Bosboot to initialize all boot records and devices by executing the following command:
   bosboot -a -d /dev/hdisk?
   where hdisk? is the first hdisk listed under the “PV” heading after the command lslv -l hd5 has executed.
6. Initialize the boot list so that the system can boot from either disk:
   bootlist -m normal hdisk0 hdisk1
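Taken together, and assuming the hdisk0/hdisk1 layout used above, the procedure can be scripted roughly as follows (a sketch, not a substitute for checking each step; sysdump logical volumes are skipped as noted above):

   extendvg rootvg hdisk1
   chvg -Qn rootvg
   # Add a second copy of every rootvg LV (except dump devices) on hdisk1
   for lv in $(lsvg -l rootvg | awk 'NR>2 && $2 != "sysdump" { print $1 }'); do
       mklvcopy $lv 2 hdisk1
   done
   syncvg -v rootvg
   bosboot -a -d /dev/hdisk0        # run against the disk holding hd5
   bootlist -m normal hdisk0 hdisk1
   shutdown -Fr                     # reboot so the changes take effect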
3.1.2.2 Necessary APAR Fixes
Table 11. Necessary APAR Fixes

AIX Version   APARs needed
4.1           IX56564, IX61184, IX60521
4.2           IX62417, IX68483, IX70884, IX72058
4.3           IX72550

To determine whether a fix is installed on a machine, execute the following:
instfix -i -k <APAR number>
For example: instfix -i -k IX72550

3.1.3 AIX Prerequisite LPPs
In order to install HACMP and HACMP/ES, the AIX setup must be in a proper state. The following table gives you the prerequisite AIX levels for the different HACMP versions:
Table 12.
• nv6000.database.obj 4.1.0.0 • nv6000.Features.obj 4.1.2.0 • nv6000.client.obj 4.1.0.0 and for HAView 4.3 • xlC.rte 3.1.4.0 • nv6000.base.obj 4.1.2.0 • nv6000.database.obj 4.1.2.0 • nv6000.Features.obj 4.1.2.0 • nv6000.client.obj 4.1.2.0 3.1.4 AIX Parameter Settings This section discusses several general tasks necessary to ensure that your HACMP for AIX cluster environment works as planned. Consider or check the following issues to ensure that AIX works as expected in an HACMP cluster.
and low-water marks. If a process tries to write to a file at the high-water mark, it must wait until enough I/O operations have finished to make the low-water mark. Use the smit chgsys fastpath to set high- and low-water marks on the Change/Show Characteristics of the Operating System screen. By default, AIX is installed with high- and low-water marks set to zero, which disables I/O pacing.
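As a sketch (the values shown are commonly recommended starting points for HACMP clusters, not universal tuning advice):

   # Show the current high- and low-water marks
   lsattr -El sys0 -a maxpout -a minpout
   # Enable I/O pacing with a high-water mark of 33 and a low-water mark of 24
   chdev -l sys0 -a maxpout=33 -a minpout=24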
3.1.4.3 Editing the /etc/hosts File and Nameserver Configuration Make sure all nodes can resolve all cluster addresses. See the chapter on planning TCP/IP networks (the section Using HACMP with NIS and DNS) in the HACMP for AIX, Version 4.3: Planning Guide, SC23-4277 for more information on name serving and HACMP. Edit the /etc/hosts file (and the /etc/resolv.conf file, if using the nameserver configuration) on each node in the cluster to make sure the IP addresses of all clustered interfaces are listed.
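A sketch of matching /etc/hosts entries for a two-node cluster follows (labels, addresses, and subnets are purely illustrative; note that the standby adapters sit on their own subnet):

   100.100.50.1    nodea_boot
   100.100.50.2    nodea_svc    nodea
   100.100.51.1    nodea_stby
   100.100.50.3    nodeb_boot
   100.100.50.4    nodeb_svc    nodeb
   100.100.51.2    nodeb_stby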
#! /bin/sh
# This script checks for a ypbind and a cron process. If both
# exist and cron was started before ypbind, cron is killed so
# it will respawn and know about any new users that are found
# in the passwd file managed as an NIS map.
# (Completion below is a reconstruction; it simplifies the start-order
# check by assuming cron predates ypbind at boot time.)
echo "Entering $0 at `date`" >> /tmp/refr_cron.out
cronPid=`ps -ef | grep "/etc/cron" | grep -v grep | awk '{ print $2 }'`
ypbindPid=`ps -ef | grep ypbind | grep -v grep | awk '{ print $2 }'`
if [ -n "$cronPid" ] && [ -n "$ypbindPid" ]; then
    # cron needs no restart entry here because init respawns it
    # automatically (see /etc/inittab).
    echo "Killing cron (pid $cronPid) so it respawns" >> /tmp/refr_cron.out
    kill $cronPid
fi
echo "Exiting $0 at `date`" >> /tmp/refr_cron.out
3.2 Network Connection and Testing The following sections describe important aspects of network connection and testing. 3.2.1 TCP/IP Networks Since there are several types of TCP/IP Networks available within HACMP, there are several different characteristics and some restrictions on them. Characteristics like maximum distance between nodes have to be considered. You don’t want to put two cluster nodes running a mission-critical application in the same room for example. 3.2.1.
Figure 9. Connecting Networks to a Hub
To comply with these rules, pay careful attention to the IP addresses you assign to standby adapters. Standby adapters must be on a separate subnet from the service adapters, even though they are on the same physical network. Placing standby adapters on a different subnet from the service adapter allows HACMP for AIX to determine which adapter TCP/IP will use to send a packet to a network.
• Scan the /tmp/hacmp.out file to confirm that the /etc/rc.net script has run successfully. Look for a zero exit status. • If IP address takeover is enabled, confirm that the /etc/rc.net script has run and that the service adapter is on its service address and not on its boot address. • Use the lssrc -g tcpip command to make sure that the inetd daemon is running. • Use the lssrc -g portmap command to make sure that the portmapper daemon is running.
TMSSA Target-mode SSA is only supported with the SSA Multi-Initiator RAID Adapters (Feature #6215 and #6219), Microcode Level 1801 or later. You need at least HACMP Version 4.2.2 with APAR IX75718. 3.2.2.2 Configuring RS232 Use the smit tty fastpath to create a tty device on the nodes. On the resulting panel, you can add an RS232 tty by selecting a native serial port, or a port on an asynchronous adapter. Make sure that the Enable Login field is set to disable.
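Before configuring HACMP, the serial link itself can be checked with a simple test; this sketch assumes the tty was created as tty1 on both nodes:

   # On node A:
   stty < /dev/tty1
   # On node B:
   stty < /dev/tty1
   # Each command hangs until the other end runs; once both are
   # started, each side prints its tty settings, which indicates
   # that the cable and ports are working.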
3.2.2.4 Configuring Target Mode SSA The node number on each system needs to be changed from the default of zero to a number. All systems on the SSA loop must have a unique node number. To change the node number use the following command: chdev -l ssar -a node_number=# To show the system’s node number use the following command: lsattr -El ssar Having the node numbers set to non-zero values enables the target mode devices to be configured.
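A short sketch of the whole sequence (the node numbers are illustrative):

   # On node A:
   chdev -l ssar -a node_number=1
   # On node B:
   chdev -l ssar -a node_number=2
   # Then, on both nodes, configure and list the target mode devices:
   cfgmgr
   lsdev -C | grep tmssa    # tmssa devices should show as Available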
To test the connection, enter:

cat < /dev/tmssax.tm

on one node for reading from the target mode device, and:

cat /etc/environment > /dev/tmssay.im

on the corresponding node for writing; x and y correspond to the node number of the opposite node in each case. You should see the first command hang until the second command is issued, and then display its output.

Target Mode SCSI: After configuring Target Mode SCSI, you can check the functionality of the connection by entering the command:

cat < /dev/tmscsix.tm

on one node for reading from that device, and:

cat /etc/environment > /dev/tmscsiy.im

on the other node for writing.
For more information regarding adapters and cabling rules see 2.3.
Adapter Definitions
By issuing the following command, you can check the correct adapter configuration. In order to work correctly, the adapter must be in the “Available” state:

# lsdev -C | grep ssa
ssa0 Available 00-07 SSA Enhanced Adapter
ssar Defined         SSA Adapter Router

The third column in the adapter device line shows the location of the adapter.

Disk Definitions
SSA disk drives are represented in AIX as SSA logical disks (hdisk0, hdisk1,...,hdiskN) and SSA physical disks (pdisk0, pdisk1,...,pdiskN).
# lsdev -Cc disk | grep SSA
hdisk3 Available 00-07-L SSA Logical Disk Drive
hdisk4 Available 00-07-L SSA Logical Disk Drive
hdisk5 Available 00-07-L SSA Logical Disk Drive
hdisk6 Available 00-07-L SSA Logical Disk Drive
hdisk7 Available 00-07-L SSA Logical Disk Drive
hdisk8 Available 00-07-L SSA Logical Disk Drive

SSA physical disks:
• Are configured as pdisk0, pdisk1,...,pdiskN.
• Have errors logged against them in the system error log.
• Support a character special file (/dev/pdisk0, /dev/pdisk1,...
Configuration Verification This option enables you to display the relationships between physical (pdisk) and logical (hdisk) disks. Format Disk This option enables you to format SSA disk drives. Certify Disk This option enables you to test whether data on an SSA disk drive can be read correctly. Display/Download... This option enables you to display the microcode level of the SSA disk drives and to download new microcode to individual or all SSA disk drives connected to the system.
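The same pdisk/hdisk relationship can also be checked from the command line with the ssaxlate utility (a sketch; the disk names are illustrative):

   ssaxlate -l hdisk3    # translate a logical disk to its physical disk(s)
   ssaxlate -l pdisk0    # translate a physical disk to its logical disk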
Note You must ensure that: • You do not attempt to perform this adapter microcode download concurrently on systems that are in the same SSA loop. This may cause a portion of the loop to be isolated and could prevent access to these disks from elsewhere in the loop. • You do not run advanced diagnostics while downloads are in progress.
18. To confirm that the upgrade was a success, type lscfg -vl pdiskX, where X is 0, 1, ... for all SSA disks. Check the ROS Level line to see that each disk has the appropriate microcode level (for the correct microcode level, see the above-mentioned Web site).
3.3.2.1 Cabling
The following sections describe important information about cabling.

SCSI Adapters
An overview of SCSI adapters that can be used on a shared SCSI bus is given in 2.3.2.3, “Supported SCSI Adapters” on page 26. For the necessary adapter changes, see 3.3.2.3, “Adapter SCSI ID and Termination change” on page 77.

RAID Enclosures
The 7135 RAIDiant Array can hold a maximum of 30 single-ended disks in two units (one base and one expansion).
FC: 2902 or 9202 (2.4m), PN: 67G1260 - OR FC: 2905 or 9205 (4.5m), PN: 67G1261 - OR FC: 2912 or 9212 (12m), PN: 67G1262 - OR FC: 2914 or 9214 (14m), PN: 67G1263 - OR FC: 2918 or 9218 (18m), PN: 67G1264 • Terminator (T) Included in FC 2422 (Y-Cable), PN: 52G7350 • Cable Interposer (I) FC: 2919, PN: 61G8323 One of these is required for each connection between an SCSI-2 Differential Y-Cable and a Differential SCSI Cable going to the 7135 unit, as shown in Figure 10.
FC: 2426 (0.94m), PN: 52G4234 • 16-Bit SCSI-2 Differential System-to-System Cable FC: 2424 (0.6m), PN: 52G4291 - OR FC: 2425 (2.5m), PN: 52G4233 This cable is used only if there are more than two nodes attached to the same shared bus. • 16-Bit Differential SCSI Cable (RAID Cable) FC: 2901 or 9201 (0.6m), PN: 67G1259 - OR FC: 2902 or 9202 (2.4m), PN: 67G1260 - OR FC: 2905 or 9205 (4.
[Figure 10 (diagram): 7135 unit attached to a shared 16-bit SCSI bus built from #2416, #2424, and #2426 cables with terminators (T) at both ends; maximum total cable length: 25m]
[Diagram: two 7135-110 units, each with two controllers attached via #2902 RAID cables to two shared 16-bit SCSI buses built from #2416, #2424, and #2426 cables with terminators (T) at both ends; maximum total cable length: 25m]
Figure 11. 7135-110 RAIDiant Arrays Connected on Two Shared 16-Bit SCSI Buses
The termination resistor blocks on the SCSI-2 Differential Controller, the SCSI-2 Differential Fast/Wide Adapter/A, and the Enhanced SCSI-2 Differential Fast/Wide Adapter/A are shown in Figure 12 and Figure 13 respectively.

Figure 12. Termination on the SCSI-2 Differential Controller (termination resistor blocks P/N 43G0176)

Figure 13. Termination on the SCSI-2 Differential Fast/Wide Adapter/A (internal 16-bit SE and internal 8-bit SE connectors; termination resistor blocks P/N 56G7315)
The ID of an SCSI adapter, by default, is 7. Since each device on an SCSI bus must have a unique ID, the ID of at least one of the adapters on a shared SCSI bus has to be changed.

The procedure to change the ID of an SCSI-2 Differential Controller is:
1. At the command prompt, enter smit chgscsi.
2. Select the adapter whose ID you want to change from the list presented to you.

   SCSI Adapter
   Move cursor to desired item and press Enter.
Change / Show Characteristics of a SCSI Adapter
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
Change/Show Characteristics of a SCSI Adapter

  SCSI adapter                     ascsi1
  Description                      Wide SCSI I/O Control>
  Status                           Available
  Location                         00-06
  Internal SCSI ID                 7        +#
  External SCSI ID                 [6]      +#
  WIDE bus enabled                 yes      +
  ...
  Apply change to DATABASE only    yes

The command line version of this is:
# chdev -l ascsi1 -a id=6 -P

As in the case of the SCSI-2 Differential Controller, a system reboot is required to bring the change into effect.
3.4.1 Creating Shared VGs The following sections contain information about creating non-concurrent VGs and VGs for concurrent access. 3.4.1.1 Creating Non-Concurrent VGs This section covers how to create a shared volume group on the source node using the SMIT interface. Use the smit mkvg fastpath to create a shared volume group. Use the default field values unless your site has other requirements, or unless you are specifically instructed otherwise here. Table 13.
Creating a Concurrent Access Volume Group on Serial Disk Subsystems To use a concurrent access volume group, defined on a serial disk subsystem such as an IBM 7133 disk subsystem, you must create it as a concurrent-capable volume group. A concurrent-capable volume group can be activated (varied on) in either non-concurrent mode or concurrent access mode. To define logical volumes on a concurrent-capable volume group, it must be varied on in non-concurrent mode.
Use the smit mkvg fastpath to create a shared volume group. Use the default field values unless your site has other requirements, or unless you are specifically instructed otherwise.

Table 15. smit mkvg Options (Concurrent, RAID)

VOLUME GROUP name: The name of the shared volume group should be unique within the cluster.

Activate volume group AUTOMATICALLY at system restart?: Set to no so that the volume group can be activated as appropriate by the cluster event scripts.
the journaled file system log (jfslog) is a logical volume that requires a unique name in the cluster. To make sure that logical volumes have unique names, rename the logical volume associated with the file system and the corresponding jfslog logical volume. Use a naming scheme that indicates the logical volume is associated with a certain file system. For example, lvsharefs could name a logical volume for the /sharefs file system.
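A sketch of such a renaming (volume group, logical volume, and log names are illustrative; the corresponding stanzas in /etc/filesystems must be updated to match):

   varyonvg sharedvg
   chlv -n lvsharefs    lv04       # rename the file system logical volume
   chlv -n lvsharefslog loglv00    # rename its jfslog logical volume
   logform /dev/lvsharefslog       # re-initialize the renamed jfslog
   # then update the dev= and log= entries for /sharefs in /etc/filesystems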
That is, you enter this command for each disk. In the resulting display, locate the line for the logical volume for which you just added copies. For copies placed on separate disks, the numbers in the logical partitions column and the physical partitions column should be equal. Otherwise, the copies were placed on the same disk and the mirrored copies will not protect against disk failure. Testing a File System To run a consistency check on each file system’s information: 1. Enter: fsck /filesystem_name 2.
The TaskGuide uses a graphical interface to guide you through the steps of adding nodes to an existing volume group. For more information on the TaskGuide, see 3.4.6, “Alternate Method - TaskGuide” on page 90. Importing the volume group onto the destination nodes synchronizes the ODM definition of the volume group on each node on which it is imported. You can use the smit importvg fastpath to import the volume group. Table 17.
A QUORUM of disks required to keep the volume group online?: This field is site-dependent. See 3.4.5, “Quorum” on page 88 for a discussion of quorum in an HACMP cluster.

3.4.4.4 Varying Off the Volume Group on the Destination Nodes
Use the varyoffvg command to deactivate the shared volume group so that it can be imported onto another destination node or activated as appropriate by the cluster event scripts. Enter:
varyoffvg volume_group_name
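A sketch of the sequence on each destination node (volume group name, major number, and disk are illustrative):

   importvg -y sharedvg -V 60 hdisk3   # -V imports with the agreed major number
   chvg -a n sharedvg                  # do not activate automatically at boot
   varyoffvg sharedvg                  # leave activation to the cluster scripts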
command succeeds. If exactly half the copies are available, as with two of four, quorum is not achieved and the varyonvg command fails. 3.4.5.2 Quorum after Vary On If a write to a physical volume fails, the VGSAs on the other physical volumes within the volume group are updated to indicate that one physical volume has failed. As long as more than half of all VGDAs and VGSAs can be written, quorum is maintained and the volume group remains varied on.
Forcing a Varyon A volume group with quorum disabled and one or more physical volumes unavailable can be “forced” to vary on by using the -f flag with the varyonvg command. Forcing a varyon with missing disk resources can cause unpredictable results, including a reducevg of the physical volume from the volume group. Forcing a varyon should be an overt (manual) action and should only be performed with a complete understanding of the risks involved.
conflict with the cluster’s configuration. Online help panels give additional information to aid in each step. 3.4.6.1 TaskGuide Requirements Before starting the TaskGuide, make sure: • You have a configured HACMP cluster in place. • You are on a graphics capable terminal. 3.4.6.2 Starting the TaskGuide You can start the TaskGuide from the command line by typing: /usr/sbin/cluster/tguides/bin/cl_ccvg or you can use the SMIT interface as follows: 1. Type smit hacmp. 2.
Chapter 4. HACMP Installation and Cluster Definition This chapter describes issues concerning the actual installation of HACMP Version 4.3 and the definition of a cluster and its resources. It concentrates on the HACMP part of the installation, so, we will assume AIX is already at the 4.3.2 level. Please refer to the AIX Version 4.3: Migration Guide , SG24-5116, for details on installation or migration to that level.
cluster.base.server.utils HACMP Base Server Utilities • cluster.cspoc This component includes all of the commands and environment for the C-SPOC utility, the Cluster-Single Point Of Control feature. These routines are responsible for centralized administration of the cluster. There is no restriction on the node from which you run the C-SPOC utility commands, so it should also be installed on all the server nodes. It consists of the following: cluster.cspoc.rte cluster.cspoc.cmds cluster.cspoc.
• cluster.vsm The Visual Systems Management Fileset contains Icons and bitmaps for the graphical Management of HACMP Resources, as well as the xhacmpm command: cluster.vsm HACMP X11 Dependent • cluster.haview This fileset contains the files for including HACMP cluster views into a TME 10 Netview Environment. It is installed on a Netview network management machine, and not on a cluster node: cluster.haview HACMP HAView • cluster.man.en_US.haview.
This fileset contains the Application Heart Beat Daemon, Oracle Parallel Server is an application that makes use of it: cluster.hc.rte Application Heart Beat Daemon The installation of CRM requires the following software: bos.rte.lvm.usr.4.3.2.0 AIX Run-time Executable Install Server Nodes From whatever medium you are going to use, install the needed filesets on each node. Refer to Chapter 8 of the HACMP for AIX, Version 4.3: Installation Guide, SC23-4278 for details.
HACMP software to HACMP for AIX, Version 4.3. The comments on upgrading the Operating System are not included. If you are already running AIX 4.3, see the special note at the end of this section. Note Although your objective in performing a migration installation is to keep the cluster operational and to preserve essential configuration information, do not run your cluster with mixed versions of the HACMP for AIX software for an extended period of time. 4.1.2.1 Upgrading from Version 4.1.0 through 4.2.
Install HACMP 4.3 for AIX on Node A 5. After upgrading AIX and verifying that the disks are correctly configured, install the HACMP 4.3 for AIX software on Node A. For a short description of the filesets, please refer to 4.1.1, “First Time Installs” on page 93 or to Chapter 8 of the HACMP for AIX, Version 4.3: Installation Guide, SC23-4278. 6. The installation process automatically runs the cl_convert program. It removes the current HACMP objects from /etc/objrepos and saves them to HACMP.old.
file on Node A using the following command: /usr/sbin/cluster/utilities/cllsif -x >> /.rhosts This command will append information to the /.rhosts file instead of overwriting it. Then, you can ftp this file to the other nodes as necessary. 12.Verify the cluster topology on all nodes using the clverify utility. 13.Check that custom event scripts are properly installed. 14.Synchronize the node configuration and the cluster topology from Node A to all nodes (this step is optional). 15.
2. If you wish to save your cluster configuration, see the chapter Saving and Restoring Cluster Configurations in the HACMP for AIX, Version 4.3: Administration Guide, SC23-4279. 3. Commit your current HACMP for AIX software on all nodes. 4. Shut down one node (gracefully with takeover) using the smit clstop fastpath. For this example, shut down Node A. Node B will take over Node A’s resources and make them available to clients.
• The network modules You define the cluster topology by entering information about each component into HACMP-specific ODM classes. You enter the HACMP ODM data by using the HACMP SMIT interface or the VSM utility xhacmpm. The xhacmpm utility is an X Windows tool for creating cluster configurations using icons to represent cluster components. For more information about the xhacmpm utility, see the administrative facilities chapter of the HACMP for AIX, Version 4.3: Concepts and Facilities, SC23-4276.
Note
The node names are logically sorted in their ASCII order within HACMP in order to decide which nodes are considered neighbors for heartbeat purposes. In order to build a logical ring, a node always talks to its upstream and downstream neighbors in node name ASCII order. The uppermost and lowest nodes are also considered neighbors.
Network Name Enter an ASCII text string that identifies the network. The network name can include alphabetic and numeric characters and underscores. Use no more than 31 characters. The network name is arbitrary, but must be used consistently for adapters on the same physical network. If several adapters share the same physical network, make sure you use the same network name for each of these adapters. Network Attribute Indicate whether the network is public, private, or serial.
Adapter Identifier Enter the IP address in dotted decimal format or a device file name. IP address information is required for non-serial network adapters only if the node’s address cannot be obtained from the domain name server or the local /etc/hosts file (using the adapter IP label given). You must enter device filenames for serial network adapters. RS232 serial adapters must have the device filename /dev/ttyN. Target mode SCSI serial adapters must have the device file name /dev/tmscsiN.
Note
When IPAT is configured, the run level of the IP-related entries (e.g., rctcpip, rcnfs, ...) in /etc/inittab is changed to “a”. As a result, these services are not started at boot time, but with HACMP.

Adding or Changing Adapters after the Initial Configuration
If you want to change the information about an adapter after the initial configuration, use the Change/Show an Adapter screen. See the chapter on changing the cluster topology in the HACMP for AIX, Version 4.
• SLIP • SP Switch • ATM It is highly unlikely that you will add or remove a network module. For information about changing a characteristic of a Network Module, such as the failure detection rate, see the chapter on changing the cluster topology in the HACMP for AIX, Version 4.3: Administration Guide, SC23-4279. Changing the network module allows the user to influence the rate of heartbeats being sent and received by a Network Module, thereby changing the sensitivity of the detection of a network failure.
configuration. If the cluster manager is active on some other cluster nodes but not on the local node, the synchronization operation is aborted. Before attempting to synchronize a cluster configuration, ensure that all nodes are powered on, that the HACMP software is installed, and that the /etc/hosts and /.rhosts files on all nodes include all HACMP boot and service IP labels. The /.rhosts file may not be required if you are running HACMP on the SP system.
4.3 Defining Resources The HACMP for AIX software provides a highly available environment by identifying a set of cluster-wide resources essential to uninterrupted processing, and then by defining relationships among nodes that ensure these resources are available to client processes. Resources include the following hardware and software: • Disks • Volume groups • File systems • Network addresses • Application servers In the HACMP for AIX software, you define each resource as part of a resource group.
4.3.1.1 Configuring Resources for Resource Groups Once you have defined resource groups, you further configure them by assigning cluster resources to one resource group or another. You can configure resource groups even if a node is powered down. However, SMIT cannot list possible shared resources for the node (making configuration errors likely). Note You cannot configure a resource group until you have completed the information on the Add a Resource Group screen.
Service IP Label If IP address takeover is being used, list the IP label to be moved when this resource group is taken over. Press F4 to see a list of valid IP labels. These include addresses which rotate or may be taken over. File Systems Consistency Check Identify the method for checking consistency of file systems, fsck (default) or logredo (for fast recovery). File Systems Recovery Method Identify the recovery method for the file systems, parallel (for fast recovery) or sequential (default).
as the path locations for start and stop scripts for the application. These scripts have to be in the same location on every service node. Just as for pre- and post-events, these scripts can be adapted to specific nodes. They don’t need to be equal in content. The system administrator has to ensure, however, that they are in the same location, use the same name, and are executable for the root user. 4.3.1.
4.4.2 Initial Startup
At this point in time, the cluster is not yet started, so the cluster manager has to be started first. To check whether the cluster manager is up, you can either look for the process with the ps command:

ps -ef | grep clstr

or look for the status of the cluster group subsystems:

lssrc -g cluster

or look for the status of the network interfaces.
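For example, on a node where cluster services have come up, the subsystem listing might look roughly like this (a sketch; subsystem names and numbers vary with the HACMP version and options installed):

   # lssrc -g cluster
   Subsystem         Group            PID     Status
    clstrmgr         cluster          10256   active
    clsmuxpd         cluster          11428   active
    clinfo           cluster          12100   active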
For cascading resource groups, the failed node is going to reacquire its resources once it is up and running again. So, you have to restart HACMP on it through smitty clstart and check the log file again, as well as the cluster's status. Further and more intensive debugging issues are covered in Chapter 7, “Cluster Troubleshooting” on page 143.
Essentially, a snapshot saves all the ODM classes HACMP has generated during its configuration. It does not save user customized scripts, such as start or stop scripts for an application server. However, the location and names of these scripts are in an HACMP ODM class, and are therefore saved. It is very helpful to put all the customized data in one defined place, in order to make saving these customizations easier.
Chapter 5. Cluster Customization Within an HACMP for AIX cluster, there are several things that are customizable. The following paragraphs explain the customizing features for events, error notification, network modules and topology services. 5.1 Event Customization An HACMP for AIX cluster environment acts upon a state change with a set of predefined cluster events (see 5.1.1, “Predefined Cluster Events” on page 117).
acquire_service_addr (If configured for IP address takeover.) Configures boot addresses to the corresponding service address, and starts TCP/IP servers and network daemons by running the telinit -a command. acquire_takeover_addr The script checks to see if a configured standby address exists, then swaps the standby address with the takeover address. get_disk_vg_fs Acquires disk, volume group, and file system resources.
event occurs only after a node_up_remote event has successfully completed. Sequence of node_down Events node_down This event occurs when a node intentionally leaves the cluster or fails. Depending on whether the exiting node is local or remote, this event initiates either the node_down_local or node_down_remote event, which in turn initiates a series of subevents. node_down_local Processes the following events: stop_server Stops application servers.
node_down_local_complete Instructs the Cluster Manager to exit when the local node has left the cluster. This event occurs only after a node_down_local event has successfully completed.

node_down_remote_complete Starts takeover application servers. This event runs only after a node_down_remote event has successfully completed.

start_server Starts application servers.

5.1.1.2 Network Events
network_down This event occurs when the Cluster Manager determines a network has failed.
no actions since appropriate actions depend on the local network configuration. 5.1.1.3 Network Adapter Events swap_adapter This event occurs when the service adapter on a node fails. The swap_adapter event exchanges or swaps the IP addresses of the service and a standby adapter on the same HACMP network and then reconstructs the routing table. swap_adapter_complete This event occurs only after a swap_adapter event has successfully completed.
reconfig_resource_complete This event indicates that a cluster resource dynamic reconfiguration has completed.

5.1.2 Pre- and Post-Event Processing
To tailor event processing to your environment, specify commands or user-defined scripts that should execute before and/or after a specific event is generated by the Cluster Manager.
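A minimal sketch of such a script (the path, recipient, and argument handling are illustrative; HACMP passes the event name and its parameters as arguments):

   #!/bin/ksh
   # /usr/local/cluster/notify_event -- hypothetical notify method
   EVENT=$1
   shift
   print "HACMP event $EVENT on $(hostname), args: $*" |
       mail -s "HACMP event notification: $EVENT" root
   exit 0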
For example, a file system may fail to unmount because a process is still running on it. In that case, you might want to kill that process first, before unmounting the file system, so that the event script can complete. Since the event script did not succeed on its first run, the Retry feature enables HACMP for AIX to retry it until it finally succeeds or the retry count is reached.
Each time an error is logged in the system error log, the error notification daemon determines if the error log entry matches the selection criteria. If it does, an executable is run. This executable, called a notify method, can range from a simple command to a complex program. For example, the notify method might be a mail message to the system administrator or a command to shut down the cluster.
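The selection criteria and the notify method are defined as an object in the errnotify ODM class. The stanza below is a sketch with hypothetical object, resource, and script names; adding it with odmadd makes the error notification daemon run the method (with the error log sequence number as $1) whenever a matching entry is logged:

   errnotify:
           en_name = "SHAREDDISKFAIL"
           en_persistenceflg = 1
           en_class = "H"                  # hardware errors
           en_type = "PERM"                # permanent errors only
           en_resource = "hdisk2"          # assumed shared disk
           en_method = "/usr/local/cluster/disk_notify.sh $1"

   # load the stanza into the ODM:
   # odmadd /tmp/disknotify.add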
The failure rate of networks varies, depending on their characteristics. For example, for an Ethernet, the normal failure detection rate is two keepalives per second; fast is about four per second; slow is about one per second. For an HPS network, because no network traffic is allowed when a node joins the cluster, normal failure detection is 30 seconds; fast is 10 seconds; slow is 60 seconds.
To prevent problems with NFS file systems in an HACMP cluster, make sure that each shared volume group has the same major number on all nodes. The lvlstmajor command lists the free major numbers on a node. Use this command on each node to find a major number that is free on all cluster nodes; then record that number in the Major Number field on the Shared Volume Group/File System (Non-Concurrent Access) worksheet in Appendix A, “Planning Worksheets”, of the HACMP for AIX, Version 4.3 planning documentation.
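A hedged example of the procedure (device, volume group, and major number are assumptions):

   # on every cluster node, list the free major numbers
   lvlstmajor
   # create the shared volume group with a major number free on all nodes (60 here)
   mkvg -V 60 -y sharedvg hdisk2 hdisk3
   # on each other node, import it with the same major number
   importvg -V 60 -y sharedvg hdisk2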
Figure 14. NFS Cross Mounts

When Node A fails, Node B uses the cl_nfskill utility to close open files in Node A:/afs, unmounts it, mounts it locally, and re-exports it to waiting clients. After takeover, Node B has:

/bfs  locally mounted
/bfs  nfs-exported
/afs  locally mounted
/afs  nfs-exported

Ensure that the shared volume groups have the same major number on the server nodes. This allows the clients to re-establish the NFS mount transparently after the takeover.
• Ensure that the node name and the service adapter label are the same on each node in the cluster, or
• Alias the node name to the service adapter label in the /etc/hosts file.

5.4.5 Cross Mounted NFS File Systems and the Network Lock Manager

If an NFS client application uses the Network Lock Manager, there are additional considerations to ensure a successful failover. Consider the following scenario: Node A has a file system mounted locally and exported for use by clients.
######## Add for NFS Lock Removal (start) ########
######## Add for NFS Lock Removal (finish) ########
###############################################################################
#
# Name: cl_deactivate_nfs
#
# Given a list of nfs-mounted filesystems, we try and unmount -f
# any that are currently mounted.
#
# Arguments: list of filesystems.
#
# ... (intervening script lines not shown) ...

        fi
        /bin/rm -f /etc/sm.bak/$host
        /bin/rm -f /etc/sm/$host
        /bin/rm -f /etc/state
    fi
######## Add for NFS Lock Removal (finish) ########
# Send a SIGKILL to all processes having open file
# descriptors within this logical volume to allow
# the unmount to succeed.
Chapter 6. Cluster Testing

Before you start to test the HACMP configuration, you need to ensure that your cluster nodes are in a stable state. Check the state of the:

• Devices
• System parameters
• Processes
• Network adapters
• LVM
• Cluster
• Other items such as SP Switch, printers, and SNA configuration

6.1 Node Verification

Here is a series of suggested actions to test the state of a node before including HACMP in the testing.

6.1.1 Device State

• Run diag -a in order to clean up the VPD.
6.1.2 System Parameters

• Type date on all nodes to check that all the nodes in the cluster are running with their clocks set to the same time.
• Ensure that the number of user licenses has been correctly set (lslicense).
• Check the high water mark and other system settings (smitty chgsys).
• Type sysdumpdev -l and sysdumpdev -e to ensure that the dump space is correctly set and that the primary dump device (lslv hd7) is large enough to accommodate a dump.
• Check that all interfaces communicate (ping or ping -R).
• List the arp table entries with arp -a.
• Check the status of the TCP/IP daemons (lssrc -g tcpip).
• Ensure that there are no bad entries in the /etc/hosts file, especially at the bottom of the file.
• Verify that, if DNS is in use, the DNS servers are correctly defined (more /etc/resolv.conf).
• Check the status of NIS by typing ps -ef | grep ypbind and lssrc -g yp.
• Verify the cluster configuration by running /usr/sbin/cluster/diag/clconfig -v '-tr'.
• To show the cluster configuration, run /usr/sbin/cluster/utilities/cllscf.
• To show the clstrmgr version, type snmpinfo -m dump -o /usr/sbin/cluster/hacmp.defs clstrmgr.

6.2 Simulate Errors

The following paragraphs give you hints on how to simulate different hardware and software errors in order to verify your HACMP configuration.
• Use ifconfig to move the service address back to the original service interface (ifconfig en1 down). This causes the service IP address to fail back to the service adapter on NodeF.

6.2.1.2 Ethernet or Token Ring Adapter or Cable Failure

Perform the following steps to simulate an Ethernet or Token Ring adapter or cable failure:

• Check, by way of the verification commands, that all the nodes in the cluster are up and running.
• Optional: Prune the error log on NodeF (errclear 0).
• Generate the switch error in the error log which is being monitored by HACMP Error Notification (for configuration see 2.6.2.1, “Single Point-of-Failure Hardware Component Recovery” on page 46), or, if the network_down event has been customized, bring down css0 (ifconfig css0 down) or fence out NodeF from the Control Workstation (Efence NodeF).
• Verify that all sharedvg file systems and paging spaces are accessible (df -k and lsps -a).

6.2.2 Node Failure / Reintegration

The following sections deal with issues of node failure and reintegration.

6.2.2.1 AIX Crash

Perform the following steps to simulate an AIX crash:

• Check, by way of the verification commands, that all the nodes in the cluster are up and running.
• Optional: Prune the error log on NodeF (errclear 0).
• Verify that failover has occurred (netstat -i and ping for networks, lsvg -o and vi of a test file for volume groups, and ps -U for application processes).
• Power cycle NodeF. If HACMP is not configured to start from /etc/inittab (on restart), start HACMP on NodeF (smit clstart). NodeF will take back its cascading resource groups.
• Monitor the cluster log files on NodeT.
• Disconnect the network cable from the appropriate service and all the standby interfaces at the same time (but not the Administrative SP Ethernet) on NodeF. This will cause HACMP to detect a network_down event.
• HACMP triggers events depending on your configuration of the network_down event. By default, no action is triggered by the network_down event.
• Verify that the expected action has occurred.
• Reconnect hdisk0, close the casing, and turn the key to normal mode.
• Power on NodeF, then verify that the rootvg logical volumes are no longer stale (lsvg -l rootvg).

6.2.4.2 7135 Disk Failure

Perform the following steps to simulate a disk failure:

• Check, by way of the verification commands, that all the nodes in the cluster are up and running.
• Optional: Prune the error log on NodeF (errclear 0).
• Monitor cluster log files on NodeT if HACMP has been customized to monitor 7135 disk failures.
• Monitor cluster log files on NodeT if HACMP has been customized to monitor 7133 disk failures.
• Since the 7133 disk is hot pluggable, remove a disk from drawer 1 associated with NodeF's shared volume group.
• The failure of the 7133 disk will be detected in the error log (errpt -a | more) on NodeF, and the logical volumes with copies on that disk will be marked stale (lsvg -l NodeFvg).
• Verify that all NodeFvg file systems and paging spaces are accessible (df -k and lsps -a).
Chapter 7. Cluster Troubleshooting

Typically, a functioning HACMP cluster requires minimal intervention. If a problem occurs, however, diagnostic and recovery skills are essential. Thus, troubleshooting requires that you identify the problem quickly and apply your understanding of the HACMP for AIX software to restore the cluster to full operation.
Log File Name and Description:

system error log: Contains time-stamped, formatted messages from all AIX subsystems, including the HACMP for AIX scripts and daemons.

/usr/sbin/cluster/history/cluster.mmdd: Contains time-stamped, formatted messages generated by the HACMP for AIX scripts. The system creates a new cluster history log file every day that has a cluster event occurring. It identifies each day’s file by the filename extension, where mm indicates the month and dd indicates the day.

/tmp/cm.log: Contains time-stamped, formatted messages generated by HACMP for AIX clstrmgr activity (see also 8.1.3.5).
hang. After a certain amount of time (by default, 360 seconds), the cluster manager will issue a config_too_long message into the /tmp/hacmp.out file. The message issued looks like this:

The cluster has been in reconfiguration too long;Something may be wrong.

In most cases, this is because an event script has failed. You can find out more by analyzing the /tmp/hacmp.out file. The error messages in the /var/adm/cluster.log file may also be helpful.
7.3.1 Tuning the System Using I/O Pacing

Use I/O pacing to tune the system so that system resources are distributed more equitably during large disk writes. Enabling I/O pacing is required for an HACMP cluster to behave correctly during large disk writes, and it is strongly recommended if you anticipate large blocks of disk writes on your HACMP cluster. You can enable I/O pacing using the smit chgsys fastpath to set high- and low-water marks.
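Besides SMIT, the same tuning can be done from the command line; the values 33 and 24 below are the starting values commonly suggested in the HACMP documentation, not universal constants:

   # enable I/O pacing by setting the high- and low-water marks on sys0
   chdev -l sys0 -a maxpout=33 -a minpout=24
   # verify the current settings
   lsattr -El sys0 -a maxpout -a minpout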
7.3.4 Changing the Failure Detection Rate

Use the SMIT Change/Show a Cluster Network Module screen to change the failure detection rate for your network module only if enabling I/O pacing or extending the syncd frequency did not resolve deadman problems in your cluster. By changing the failure detection rate to “Slow”, you can extend the time required before the deadman switch is invoked on a hung node and before a takeover node detects a node failure and acquires a hung node’s resources.
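Extending the syncd frequency, mentioned above, means changing the interval with which syncd flushes I/O. The daemon is started from /sbin/rc.boot, and HACMP documentation suggests reducing the default of 60 seconds to 10; the exact line may differ by AIX level, so treat this as illustrative:

   # in /sbin/rc.boot, change the syncd invocation from 60 to 10 seconds
   nohup /usr/sbin/syncd 10 > /dev/null 2>&1 &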
and control messages so that the Cluster Manager has accurate information about the status of its partner. When a cluster becomes partitioned, and the network problem is cleared only after takeover processing has begun, keepalive packets start flowing between the partitioned nodes again, and something must be done to restore order in the cluster. This order is restored by the DGSP message.
7.6 User ID Problems

Within an HACMP cluster, you always have more than one node potentially offering the same service to a specific user or user ID. Because the node providing the service can change, the system administrator has to ensure that the same users and groups are known to all nodes that potentially run an application.
• Go from the simple to the complex. Make the simple tests first. Do not try anything complex and complicated until you have ruled out the simple and obvious.
• Do not make more than one change at a time. If you do, and one of the changes corrects the problem, you have no way of knowing which change actually fixed the problem. Make one change, test the change, and then, if necessary, make the next change.
• Do not neglect the obvious. Small things can cause big problems.
Chapter 8. Cluster Management and Administration

This chapter covers all aspects of monitoring and managing an existing HACMP cluster. This includes a description of the different monitoring methods and tools available, how to start and stop the cluster, how to change cluster or resource configurations, applying software fixes, and user management.

8.1 Monitoring the Cluster

By design, HACMP for AIX compensates for various failures that occur within a cluster.
Consult the HACMP for AIX, Version 4.3: Troubleshooting Guide, SC23-4280, for help if you detect a problem with an HACMP cluster.

8.1.1 The clstat Command

HACMP for AIX provides the /usr/sbin/cluster/clstat command for monitoring a cluster and its components. The clstat utility is a clinfo client program that uses the Clinfo API to retrieve information about the cluster. Clinfo must be running on a node for this utility to work properly.
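For example, on an ASCII terminal you might run clstat as follows; the flags shown are typical but version-dependent, so treat them as illustrative:

   # run clstat in ASCII mode; requires a running clinfo daemon
   /usr/sbin/cluster/clstat -a
   # monitor a specific cluster by ID, refreshing automatically
   /usr/sbin/cluster/clstat -a -c 1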
More details on how to configure HAView and on how to monitor your cluster with HAView can be found in Chapter 3, “Monitoring an HACMP Cluster”, in HACMP for AIX, Version 4.3: Administration Guide, SC23-4279.

8.1.3 Cluster Log Files

HACMP for AIX writes the messages it generates to the system console and to several log files. Because each log file contains a different subset of the types of messages generated by HACMP for AIX, you can get different views of cluster status by viewing different log files.
8.1.3.5 /tmp/cm.log

Contains time-stamped, formatted messages generated by HACMP for AIX clstrmgr activity. This file is typically used by IBM support personnel.

8.1.3.6 /tmp/cspoc.log

Contains time-stamped, formatted messages generated by HACMP for AIX C-SPOC commands. The /tmp/cspoc.log file resides on the node that invokes the C-SPOC command.

8.1.3.7 /tmp/emuhacmp.out

The /tmp/emuhacmp.out file records the output generated by the event emulator scripts as they execute. The /tmp/emuhacmp.
(C-SPOC) utility can be used to start and stop cluster services on all nodes in cluster environments. Starting cluster services refers to the process of starting the HACMP for AIX daemons that enable the coordination required between nodes in a cluster. Starting cluster services on a node also triggers the execution of certain HACMP for AIX scripts that initiate the cluster. Stopping cluster services refers to stopping these same daemons on a node.
8.2.1.4 Cluster Information Program Daemon (clinfo)

This daemon provides status information about the cluster to cluster nodes and clients and invokes the /usr/sbin/cluster/etc/clinfo.rc script in response to a cluster event. The clinfo daemon is optional on cluster nodes and clients. However, it is a prerequisite for running the clstat utility. With RSCT (RISC System Cluster Technology) on HACMP/ES Version 4.3, there are several more daemons.
are started in sequential order - not in parallel. The output of the command run on the remote node is returned to the originating node. Because the command is executed remotely, there can be a delay before the command output is returned.

8.2.2.1 Automatically Restarting Cluster Services

You can optionally have cluster services start whenever the system is rebooted. If you specify the -R flag to the rc.cluster command, or specify restart or both in the Start Cluster Services SMIT screen, the rc.
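When restart is requested this way, rc.cluster arranges for cluster services to start at boot through /etc/inittab. An entry of roughly the following form is used; treat it as an illustration of the general shape, not a verbatim copy:

   hacmp:2:wait:/usr/sbin/cluster/etc/rc.cluster -boot > /dev/console 2>&1 # Bring up Cluster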
node. Because the command is executed remotely, there can be a delay before the command output is returned.

8.2.3.1 When to Stop Cluster Services

You typically stop cluster services in the following situations:

• Before making any hardware or software changes or other scheduled node shutdowns or reboots. Failing to do so may cause unintended cluster events to be triggered on other nodes.
• Before certain reconfiguration activity.
prevents unpredictable behavior from corrupting the data on the shared disks. See the clexit.rc man page for additional information.

Important Note: Never use the kill -9 command on the clstrmgr daemon. Using the kill command causes the clstrmgr daemon to exit abnormally. This causes the SRC to run the /usr/sbin/cluster/utilities/clexit.rc script, which halts the system immediately, causing the surviving nodes to initiate failover.
8.3 Replacing Failed Components

From time to time, it will be necessary to perform hardware maintenance or upgrades on cluster components. Some replacements or upgrades can be performed while the cluster is operative, while others require planned downtime. Make sure you plan all the necessary actions carefully; this will spare you a lot of trouble.

8.3.1 Nodes

When maintaining or upgrading a node, cluster services must usually be stopped on the node.
• The new adapter must be of the same type as, or a type compatible with, the replaced adapter.
• When replacing or adding a SCSI adapter, remove the resistors for shared buses. Furthermore, set the SCSI ID of the adapter to a value other than 7.
4. Logically remove the disk from the system (rmdev -l hdiskX -d; also rmdev -l pdiskY -d if it is an SSA disk) on all nodes.
5. Physically remove the failed disk and replace it with a new disk.
6. Add the disk to the ODM (mkdev or cfgmgr) on all nodes.
7. Add the disk to the shared volume group (extendvg).
8. Increase the number of LV copies to span across the new disk (mklvcopy).
9. Synchronize the volume group (syncvg).

Note: Steps 10 and 11 are only necessary in HACMP versions prior to 4.2. With HACMP 4.
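Steps 4 through 9 condense to a short command sequence; the disk, volume group, and logical volume names below are hypothetical:

   rmdev -l hdisk3 -d          # step 4: remove the failed disk from the ODM (all nodes)
   # step 5: physically replace the disk
   cfgmgr                      # step 6: configure the replacement disk (all nodes)
   extendvg sharedvg hdisk3    # step 7: add the new disk to the shared volume group
   mklvcopy sharedlv 2 hdisk3  # step 8: extend the LV mirror onto the new disk
   syncvg -v sharedvg          # step 9: synchronize the volume group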
8.4 Changing Shared LVM Components

Changes to VG constructs are probably the most frequent kind of changes to be performed in a cluster.
When changing shared LVM components manually, you will usually need to run through the following procedure:

1. Stop HACMP on the node owning the shared volume group (sometimes stopping the applications using the shared volume group may be sufficient).
2. Make the necessary changes to the shared LVM components.
3. Unmount all the file systems of the shared volume group.
4. Varyoff the shared volume group.
5. Export the old volume group definitions on the next node.
6.
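The list continues with importing the new definitions on the next node. A sketch of the complete sequence, with hypothetical volume group, file system, and disk names, might look like this:

   # node A (current owner): make the change, then release the volume group
   chfs -a size=+8192 /sharedfs        # example change: extend a file system
   umount /sharedfs
   varyoffvg sharedvg
   # node B: pick up the changed definitions
   exportvg sharedvg
   importvg -V 60 -y sharedvg hdisk2   # keep the agreed-upon major number
   chvg -a n sharedvg                  # do not activate automatically at boot
   varyoffvg sharedvg                  # leave it ready for HACMP to acquire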
Lazy Update has some limitations, which you need to consider if you rely on Lazy Update in general:

• If the first disk in a shared volume group has been replaced, the importvg command will fail, because Lazy Update expects to be able to match the hdisk number for the first disk to a valid PVID in the ODM.
• Multi-LUN support on the SCSI RAID cabinets can be very confusing to Lazy Update, as each LUN appears as a new hdisk known to only one node in the cluster (remember that Lazy Update works on LVM constructs).
• Shared volume groups
  • List all volume groups in the cluster.
  • Import a volume group (with HACMP 4.3 only).
  • Extend a volume group (with HACMP 4.3 only).
  • Reduce a volume group (with HACMP 4.3 only).
  • Mirror a volume group (with HACMP 4.3 only).
  • Unmirror a volume group (with HACMP 4.3 only).
  • Synchronize volume group mirrors (with HACMP 4.3 only).
• Shared logical volumes
  • List all logical volumes by volume group.
  • Add a logical volume to a volume group (with HACMP 4.3 only).
To use the SMIT shortcuts to C-SPOC, type smit cl_lvm, or smit cl_conlvm for concurrent volume groups. Concurrent volume groups must be varied on in concurrent mode to perform tasks.

8.4.4 TaskGuide

The TaskGuide is a graphical interface that simplifies the task of creating a shared volume group within an HACMP cluster configuration.
To change the nodes associated with a given resource group, or to change the priorities assigned to the nodes in a resource group chain, you must redefine the resource group. You must also redefine the resource group if you add or change a resource assigned to the group. This section describes how to add, change, and delete a resource group.

8.5.1 Add/Change/Remove Cluster Resources

You can add, change, and remove a resource group in an active cluster.
• If the Cluster Manager is active on the local node, synchronization triggers a cluster-wide, dynamic reconfiguration event. In dynamic reconfiguration, the configuration data stored in the DCD is updated on each cluster node, and, in addition, the new ODM data replaces the ODM data stored in the ACD (Active Configuration Directory) on each cluster node. The cluster daemons are refreshed and the new configuration becomes the active configuration.
8.5.3.1 Resource Migration Types

Before performing a resource migration, decide whether you will declare the migration sticky or non-sticky.

Sticky Resource Migration

A sticky migration permanently attaches a resource group to a specified node. The resource group attempts to remain on the specified node during a node failover or reintegration. Since stickiness is a behavioral property of a resource group, assigning a node as a sticky location makes the specified resource group a sticky resource.
INACTIVE_TAKEOVER flag set to false and has not yet started because its primary node is down. In general, however, only rotating resource groups should be migrated in a non-sticky manner. Such migrations are one-time events and occur like normal rotating resource group transitions. After migration, the resource group immediately resumes a normal rotating resource group failover policy, but from the new location.

Note: The cldare command attempts to perform all requested migrations simultaneously.
If you do not include a location specifier in the location field, the DARE Resource Migration utility performs a default migration, again making the resources available for reacquisition.

Note: A default migration can be used to start a cascading resource group that has INACTIVE_TAKEOVER set to false and that has not yet started because its primary node is down.
Note that you cannot add nodes to the resource group list with the DARE Resource Migration utility. This task is performed through SMIT.

Stopping Resource Groups

If the location field of a migration contains the keyword stop instead of an actual node name, the DARE Resource Migration utility attempts to stop the resource group, which includes taking down any service label, unmounting file systems, and so on.
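The location field forms described here translate into cldare invocations of roughly the following shape; the exact syntax is release-dependent, so these are illustrative only (the resource group and node names are assumptions):

   cldare -M appgroup:nodeB:sticky   # sticky migration of appgroup to nodeB
   cldare -M appgroup:stop           # stop the resource group
   cldare -M appgroup:default        # default (non-sticky) migration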
Be aware that persistent sticky location markers are saved and restored in cluster snapshots. You can use the clfindres command to find out if sticky markers are present in a resource group. If you want to remove sticky location markers while the cluster is down, the default keyword is not a valid method, since it implies activating the resource.
5. Restart the HACMP for AIX software on the node using the smit clstart fastpath and verify that the node successfully joined the cluster.
6. Repeat Steps 1 through 5 on the remaining cluster nodes.

Figure 15 illustrates the procedure: System A's resources fall over to System B, PTFs are applied to System A and tested, and System A then rejoins the cluster.
• Cluster nodes should be running the same HACMP maintenance levels. There might be incompatibilities between various maintenance levels of HACMP, so you must ensure that consistent levels are maintained across all cluster nodes. The cluster must be taken down to update the maintenance levels.

8.7 Backup Strategies

HACMP software masks hardware failures in clustered RISC System/6000 environments by quickly switching over to backup machines or other redundant components.
8.7.1.1 How to Do a Split-Mirror Backup

This same procedure can be used with just one mirrored copy of a logical volume. If you remove a mirrored copy of a logical volume (and file system), and then create a new logical volume (and file system) using the allocation map from that mirrored copy, the new logical volume and file system will contain the same data as the original logical volume.
9. After the backup is complete and verified, unmount and delete the new file system and the logical volume you used for it.
10. Use the mklvcopy command to add back the logical volume copy you previously split off to the fslv logical volume.
11. Resynchronize the logical volume. Once the mirror copy has been re-created on the logical volume, the syncvg command will resynchronize all physical partitions in the new copy, including any updates that have occurred on the original copy during the backup process.
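Steps 10 and 11 might, for instance, look like this (the disk name is an assumption; fslv is the logical volume named in step 10):

   mklvcopy fslv 2 hdisk2    # re-create the second mirror copy on hdisk2
   syncvg -l fslv            # resynchronize the stale partitions of the logical volume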
If they don’t match, the user won’t get anything done after a failover has happened. So, the administrator has to keep the definitions equal throughout the cluster. Fortunately, the C-SPOC utility, as of HACMP Version 4.3, does this for you. When you create a cluster group or user using C-SPOC, it makes sure that the group ID or user ID is the same throughout the cluster.
To add a user on one or more nodes in a cluster, you can either use the AIX mkuser command in an rsh to one cluster node after the other, or use the C-SPOC cl_mkuser command or the Add a User to the Cluster SMIT screen. The cl_mkuser command calls the AIX mkuser command to create the user account on each cluster node you specify, and creates a home directory for the new account on each cluster node.
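For example, to guarantee a consistent user ID without C-SPOC, you would pass an explicit id attribute on every node; with C-SPOC, a single call covers the cluster. The user name and ID below are assumptions, and the cl_mkuser options are version-dependent:

   # on each cluster node (e.g. via rsh):
   mkuser id=301 pgrp=staff home=/home/jdoe jdoe
   # or once, with C-SPOC:
   cl_mkuser id=301 pgrp=staff home=/home/jdoe jdoe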
To remove a user account from one or more cluster nodes, you can either use the AIX rmuser command on one cluster node after the other, or use the C-SPOC cl_rmuser command or the C-SPOC Remove a User from the Cluster SMIT screen. The cl_rmuser command executes the AIX rmuser command on all cluster nodes.

Note: The system removes the user account but does not remove the home directory or any files owned by the user.
Chapter 9. Special RS/6000 SP Topics

This chapter introduces some special topics that apply only if you are running HACMP on the SP system.

9.1 High Availability Control Workstation (HACWS)

If you are wondering what could happen to your SP if the Control Workstation fails, HACWS is the answer to consider. These paragraphs will not explain HACWS in full detail, but will concentrate on the most important issues for installation and configuration.
need to have the frame supervisors support dual tty lines in order to get both control workstations connected at the same time. Contact your IBM representative for the necessary hardware (see Figure 16 on page 184). Both the tty network and the RS/6000 SP internal Ethernet are extended to the backup cws. In contrast to standard HACMP, you don’t need a second Ethernet adapter on the backup cws. If you have only one, the HACWS software will work with IP alias addresses on one adapter.
The backup cws has to be installed with the same level of AIX and PSSP. Depending on the kerberos configuration of the primary cws, the backup cws has to be configured either as a secondary authentication server for the authentication realm of your RS/6000 SP when the primary cws is an authentication server itself, or as an authentication client when the primary cws is an authentication client of some other server.
ordinary HACMP cluster, as described in Chapter 7 of the HACMP for AIX, Version 4.3: Installation Guide, SC23-4278. Now the cluster environment has to be configured. Define a cluster ID and name for your HACWS cluster and define the two nodes to HACMP. Adapters have to be added to your cluster definition as described before. You will have to add a boot adapter and a service adapter for both primary and backup cws.
After that, identify the HACWS event scripts to HACMP by executing the /usr/sbin/hacws/spcw_addevents command, and verify the configuration with the /usr/sbin/hacws/hacws_verify command. You should also check the cabling from the backup cws with the /usr/sbin/hacws/spcw_verify_cabling command. Then reboot the primary and the backup cws, one after the other, and start cluster services on the primary cws with smit clstart.
Kerberos

Also spelled Cerberus - The watchdog of Hades, whose duty was to guard the entrance (against whom or what does not clearly appear); it is known to have had three heads.
- Ambrose Bierce, The Enlarged Devil’s Dictionary

The following is a shortened description of how Kerberos works. For more details, the redbook Inside the RS/6000 SP, SG24-5145, covers the subject in much more detail.
allow the clients to get service tickets to be used with other servers without the need to give the password every time they request services. So, given that a user has a ticket-granting ticket, when that user requests a kerberized service, he has to get a service ticket for it. In order to get one, the kerberized command sends an encrypted message, containing the requested service name, the machine’s name, and a time-stamp, to the Kerberos server.
After setting the cluster’s security settings to enhanced for all these nodes, you can verify that it is working as expected, for example, by running clverify, which goes out to the nodes and checks the consistency of files.

9.3 VSDs - RVSDs

VSDs (Virtual Shared Disks) and RVSDs (Recoverable Virtual Shared Disks) are SP-specific facilities that you are likely to use in an HACMP environment.

9.3.1 Virtual Shared Disk (VSDs)
With reference to Figure 17 above, imagine two nodes, Node X and Node Y, running the same application. The nodes are connected by the switch and have locally-attached disks. On Node X’s disk resides a volume group containing the raw logical volume lv_X. Similarly, Node Y has lv_Y. For the sake of illustration, let us suppose that lv_X and lv_Y together constitute an Oracle Parallel Server database to which the application on each node makes I/O requests.
The VSDs in this scenario are mapped to the raw logical volumes lv_X and lv_Y. Node X is a client of Node Y’s VSD, and vice versa. Node X is also a direct client of its own VSD (lv_X), and Node Y is a direct client of VSD lv_Y. VSD configuration is flexible. An interesting property of the architecture is that a node can be a client of any other node’s VSD(s), with no dependency on that client node owning a VSD itself.
impact of servicing a local I/O request through VSD relative to the normal VMM/LVM pathway is very small. IBM supports any IP network for VSD, but we recommend the switch for performance. VSD provides distributed data access, but not a locking mechanism to preserve data integrity. A separate product such as Oracle Parallel Server must provide the global locking mechanism. 9.3.2 Recoverable Virtual Shared Disk Recoverable Virtual Shared Disk (RVSD) adds availability to VSD.
operation that was in progress, as well as new I/O operations against rvsd_X, are suspended until failover is complete. When Node X is repaired and rebooted, RVSD switches rvsd_X back to its primary, Node X. The RVSD subsystems are shown in Figure 20 on page 194. The rvsd daemon controls recovery. It invokes the recovery scripts whenever there is a change in the group membership, which it recognizes through the use of Group Services, which in turn relies on information from Topology Services.
9.4 SP Switch as an HACMP Network

One of the fascinating things about an RS/6000 SP is the switch network. It has developed over time, so there are currently two types of switches at customer sites: the “older” HPS or HiPS switch (High Performance Switch), also known as the TB2 switch, and the “newer” SP Switch, also known as the TB3 switch. The HPS switch is no longer supported with PSSP Version 3.1, and the same applies to HACMP/ES Version 4.3.
9.4.2 Eprimary Management

The SP switch has an internal primary backup concept, where the primary node, known as the Eprimary, is backed up automatically by a backup node. If any serious failure happens on the primary, it resigns from work, and the backup node takes over the switch network handling, keeping track of routes, working on events, and so on. HACMP/ES used to have an Eprimary management function with versions below 4.3; so, if you upgrade to Version 4.
In case this node was the Eprimary node on the switch network, and it is an SP switch, then the RS/6000 SP software would have chosen a new Eprimary independently from the HACMP software as well.
Chapter 10. HACMP Classic vs. HACMP/ES vs. HANFS

So, why would you prefer to install one version of HACMP instead of another? This chapter summarizes the differences between them, to give you an idea of the situations in which one or the other best matches your needs. The certification test itself does not refer to these different HACMP flavors, but it is useful to know the differences anyway. The following paragraphs are based on the assumption that you are using Version 4.3.
handling membership and event management by using heartbeats. On the SP, the original High Availability infrastructure was built on this technology, and HACMP/ES Version 4.3 is now another instance relying on it. As of AIX 4.3.2 and PSSP 3.1, the High Availability infrastructure, which was previously tightly coupled to PSSP, was externalized into a package called RISC System Cluster Technology (RSCT). This package can be installed and run not only on SP nodes, but also on regular RS/6000 systems.
See Part 4 of HACMP for AIX, Version 4.3: Enhanced Scalability Installation and Administration Guide, SC23-4284, for more information on these services.

10.2.2 Enhanced Cluster Security

With HACMP Version 4.3 comes an option to switch the security mode between Standard and Enhanced.

Standard    Synchronization is done through the /.rhosts remote command facilities.
10.4 Similarities and Differences

All three products have the basic structure in common. They all use the same concepts and structures. So, a cluster or a network, in the HACMP context, is the same, no matter what product is being used. There is always a Cluster Manager controlling the node, keeping track of the cluster’s status, and triggering events. The differences are in the technologies being used underneath, or, in some special cases, the features available.
For switchless RS/6000 SP systems or SPs with the newer SP Switch, the decision will be based on a more functional level. Event Management is much more flexible in HACMP/ES, since you can define custom events. These events can act on anything that haemd can detect, which is virtually anything measurable on an AIX system. How to customize events is explained in great detail in the redbook HACMP Enhanced Scalability, SG24-2081.
Appendix A. Special Notices

This publication is intended to help System Administrators, System Engineers and other System Professionals to pass the IBM HACMP Certification Exam. The information in this publication is not intended as the specification for any of the following programming interfaces: HACMP, HACMP/ES, HANFS or HACWS. See the PUBLICATIONS section of the IBM Programming Announcement for those products for more information about what publications are considered to be product documentation.
been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk. Any pointers in this publication to external Web sites are provided for convenience only and do not in any manner serve as an endorsement of these Web sites.
Java and HotJava are trademarks of Sun Microsystems, Incorporated. Microsoft, Windows, Windows NT, and the Windows 95 logo are trademarks or registered trademarks of Microsoft Corporation. PC Direct is a trademark of Ziff Communications Company and is used by IBM Corporation under license. Pentium, MMX, ProShare, LANDesk, and ActionMedia are trademarks or registered trademarks of Intel Corporation in the U.S. and other countries. Network File System and NFS are trademarks of SUN Microsystems, Inc.
Appendix B. Related Publications

The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this redbook.

B.1 International Technical Support Organization Publications

For information on ordering these ITSO publications see “How to Get ITSO Redbooks” on page 211.
How to Get ITSO Redbooks

This section explains how both customers and IBM employees can find out about ITSO redbooks, CD-ROMs, workshops, and residencies. A form for ordering books and CD-ROMs is also provided. This information was current at the time of publication, but is continually subject to change. The latest information may be found at http://www.redbooks.ibm.com/.
How Customers Can Get ITSO Redbooks

Customers may request ITSO deliverables (redbooks, BookManager BOOKs, and CD-ROMs) and information about redbooks, workshops, and residencies in the following ways:

• Online Orders – send orders to:

In United States: IBMMAIL usib6fpl at ibmmail; Internet usib6fpl@ibmmail.com
In Canada: IBMMAIL caibmbkz at ibmmail; Internet lmannix@vnet.ibm.com
Outside North America: IBMMAIL dkibmbsh at ibmmail; Internet bookshop@dk.ibm.
IBM Redbook Order Form

Please send me the following:

Title / Order Number / Quantity

First name, Last name
Company
Address
City, Postal code, Country
Telephone number, Telefax number, VAT number
Invoice to customer number
Credit card number, Credit card expiration date, Card issued to, Signature

We accept American Express, Diners, Eurocard, Master Card, and Visa. Payment by credit card not available in all countries. Signature mandatory for credit card payment.
List of Abbreviations

AIX     Advanced Interactive Executive
APA     All Points Addressable
APAR    Authorized Program Analysis Report. The description of a problem to be fixed by IBM defect support. This fix is delivered in a PTF (see below).
GODM    Global Object Data Manager
GUI     Graphical User Interface
HACMP   High Availability Cluster Multi-Processing
HANFS   High Availability Network File System
HCON    Host Connection Program
NETBIOS  Network Basic Input/Output System
NFS      Network File System
NIM      Network Interface Module (This is the definition of NIM in the HACMP context. NIM in the AIX 4.1 context stands for Network Installation Manager.)
NIS      Network Information Service
NVRAM    Non-Volatile Random Access Memory
ODM      Object Data Manager
POST     Power On Self Test
PTF      Program Temporary Fix. A fix to a problem described in an APAR (see above).
Index

Symbols
/.rhosts file, editing 59
/etc/hosts file, and adapter label 38
/sbin/rc.
DGSP message 148
Disk Capacities 19
Disk Failure 139
dual-network 36
Dynamic Reconfiguration 169
E
editing /.
H
heartbeats 11
home directories 49
Hot Standby Configuration 30
hot standby configuration 30
I
Network Topology 35
networks, point-to-point 36
NFS, mounting filesystems 126
NFS, takeover issues 126
NFS cross mount 41
NFS Exports 41
NFS mount 41
NIM 199
NIS 58
Node Events 117
Node Failure / Reintegration 137
Node isolation 147
node relationships 108
non-concurrent access, quorum 90
Non-Sticky Resource Migration 170
P
partitioned cluster 147
password 49
point-to-point connection 36
principal 188
private network 38
PTFs 174
public network 37
R
RAID on SSA Disks 72
RAID on the 7133 Disk Subsystem 24
RAID vs.
Token-Ring 13
Topology Service 200
topsvcsd 156
U
Upgrading 96
user accounts, adding 179
user accounts, changing 180
user accounts, creating 179
user accounts, removing 180
V
VGDA 88
VGSA 88
Virtual Shared Disk (VSDs) 190
X
xhacmpm 101
ITSO Redbook Evaluation

IBM Certification Study Guide AIX HACMP, SG24-5131-00

Your feedback is very important to help us maintain the quality of ITSO redbooks. Please complete this questionnaire and return it using one of the following methods:

• Use the online evaluation form found at http://www.redbooks.ibm.com
• Fax this form to: USA International Access Code + 1 914 432 8264
• Send your comments in an Internet note to redbook@us.ibm.
SG24-5131-00 Printed in the U.S.A.