Managing HP Serviceguard A.11.20.20 for Linux, March 2014

8.7.1 Reviewing Package IP Addresses ...............................................................................258
8.7.2 Reviewing the System Log File ..................................................................................259
8.7.2.1 Sample System Log Entries ................................................................................259
8.7.3 Reviewing Configuration Files ...................................................................................260
8.7.4 Reviewing the Package Control Script ........................................................................260
8.7.5 Using the cmquerycl and cmcheckconf Commands......................................................260
8.7.6 Reviewing the LAN Configuration .............................................................................261
8.8 Solving Problems ...........................................................................................................261
8.8.1 Name Resolution Problems.......................................................................................261
8.8.1.1 Networking and Security Configuration Errors......................................................261
8.8.2 Halting a Detached Package....................................................................................261
8.8.3 Cluster Re-formations Caused by Temporary Conditions...............................................262
8.8.4 Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low........................262
8.8.5 System Administration Errors ....................................................................................263
8.8.5.1 Package Control Script Hangs or Failures ...........................................................263
8.8.6 Package Movement Errors (Legacy Packages)..............................................................264
8.8.7 Node and Network Failures ....................................................................................265
8.8.8 Troubleshooting the Quorum Server...........................................................................265
8.8.8.1 Authorization File Problems...............................................................................265
8.8.8.2 Timeout Problems............................................................................................265
8.8.8.3 Messages.......................................................................................................266
8.8.9 Lock LUN Messages................................................................................................266
8.9 Troubleshooting serviceguard-xdc package........................................................................266
8.10 Troubleshooting Serviceguard Manager...........................................................................267
A Designing Highly Available Cluster Applications .......................................269
A.1 Automating Application Operation ...................................................................................269
A.1.1 Insulate Users from Outages .....................................................................................269
A.1.2 Define Application Startup and Shutdown ..................................................................270
A.2 Controlling the Speed of Application Failover ....................................................................270
A.2.1 Replicate Non-Data File Systems ...............................................................................270
A.2.2 Evaluate the Use of a Journaled Filesystem (JFS)..........................................................271
A.2.3 Minimize Data Loss ................................................................................................271
A.2.3.1 Minimize the Use and Amount of Memory-Based Data .........................................271
A.2.3.2 Keep Logs Small .............................................................................................271
A.2.3.3 Eliminate Need for Local Data .........................................................................271
A.2.4 Use Restartable Transactions ....................................................................................271
A.2.5 Use Checkpoints ....................................................................................................272
A.2.5.1 Balance Checkpoint Frequency with Performance ................................................272
A.2.6 Design for Multiple Servers .....................................................................................272
A.2.7 Design for Replicated Data Sites ..............................................................................273
A.3 Designing Applications to Run on Multiple Systems ............................................................273
A.3.1 Avoid Node-Specific Information ..............................................................................273
A.3.1.1 Obtain Enough IP Addresses .............................................................................274
A.3.1.2 Allow Multiple Instances on Same System ...........................................................274
A.3.2 Avoid Using SPU IDs or MAC Addresses ...................................................................274
A.3.3 Assign Unique Names to Applications ......................................................................274
A.3.3.1 Use DNS .......................................................................................................274
A.3.4 Use uname(2) With Care ........................................................................................275
A.3.5 Bind to a Fixed Port ................................................................................................275
A.3.6 Bind to Relocatable IP Addresses .............................................................................275
A.3.6.1 Call bind() before connect() ..............................................................................276
A.3.7 Give Each Application its Own Volume Group ...........................................................276
A.3.8 Use Multiple Destinations for SNA Applications .........................................................276