Deploying Highly Available SAP® Servers using Red Hat® Cluster Suite

1801 Varsity Drive
Raleigh NC 27606-2072 USA
Phone: +1 919 754 3700
Phone: 888 733 4281
Fax: +1 919 754 3701
PO Box 13588
Research Triangle Park NC 27709 USA

The following terms used in this publication are trademarks of other companies as follows: Linux is a registered trademark of Linus Torvalds. Red Hat, Red Hat Enterprise Linux and the Red Hat "Shadowman" logo are registered trademarks of Red Hat, Inc.
1 Executive Summary

This paper details the deployment of a highly available SAP service on a Red Hat Enterprise Linux 5 cluster. After an introduction to the basic concepts and system requirements, this document provides detailed information about the Red Hat Cluster Suite (RHCS), SAP NetWeaver, and cluster configuration options.

1.1 Introduction

A cluster is essentially a group of two or more computers working together that, from an end user's perspective, appears as one server.
1.2 Audience

This document addresses SAP-certified technical consultants for SAP NetWeaver with experience in HA systems. Access to SAP information resources such as SAP Marketplace is mandatory.

1.3 Acronyms

Common acronyms referenced within this document are listed below.
RHCS   Red Hat Cluster Suite
RHEL   Red Hat Enterprise Linux
RIND   Rind Is Not Dependencies
SAN    Storage Area Network
SCS    SAP Central Services Instance (for Java)
SPOF   Single Point Of Failure
SSI    Single System Image
VFS    Virtual File System

1.4 Reference Documentation

The following list includes the existing documentation and articles referenced by this document.

Red Hat Enterprise Linux Installation Guide
http://www.redhat.com/docs/enUS/Red_Hat_Enterprise_Linux/5.
1.5 SAP Overview

In an SAP NetWeaver environment, these single points of failure (SPOF) must be considered:

- Database
- SAP Central Services Instance (SCS/ASCS)
- SAP System Mount Directory (/sapmnt/<SID>)

The SAP example system in the above illustration is a double stack with Enqueue Replication, both for ASCS and SCS. Although they are not SPOF, the Enqueue Replication Servers (ERS) are controlled by the cluster software.
To ensure that the (A)SCS "follows" the ERS instance, the follow-service dependency was implemented in RHCS. The SAP System Mount Directory should be exported by a highly available NFS server and mounted by the cluster software.

1.6 Cluster Technology Overview

For applications that require maximum system uptime, a Red Hat Enterprise Linux cluster with RHCS is the solution.
Cluster Servers:     (2) Fujitsu Siemens RX220
Storage:             EMC CLARiiON
SAN Infrastructure:  QLogic
SAP Installation:    SAP NetWeaver 2004, WebAS ABAP on MaxDB
                     SAP NetWeaver 7.0, WebAS ABAP+JAVA on Oracle

3 Hardware Requirements

3.1 Shared Storage Requirements

Shared storage indicates external storage accessible by every cluster member.
3.3 Network Requirements There should be at least two Network Interface Cards (NIC), whether embedded or added to each server. Where multiple network interfaces are available, NIC bonding can be implemented for additional availability and is the only current method providing a NIC failover ability. One bonded interface will be configured with an external IP address while the other will be configured as an interconnect between cluster members using local network connectivity.
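As an illustration, a minimal RHEL 5 bonding configuration might look like the following sketch; the interface names, bonding mode, and IP address are examples only and must be adapted to the local environment:

/etc/modprobe.conf:
    alias bond0 bonding
    options bond0 mode=1 miimon=100

/etc/sysconfig/network-scripts/ifcfg-bond0:
    DEVICE=bond0
    IPADDR=192.168.1.11
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none

/etc/sysconfig/network-scripts/ifcfg-eth0 (repeat analogously for the second slave interface):
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none

Mode 1 (active-backup) provides the NIC failover behavior described above.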
In the event of a major network problem, cluster partitioning (also known as a split-brain situation) can occur. Each partition can no longer communicate with nodes outside its own partition. A Red Hat cluster requires that the quorum requirement be fulfilled before a status change in the cluster is allowed. For example, quorum is required by the resource management system to relocate cluster resources or for the CMAN module to remove nodes from the cluster.
The quorum disk daemon (qdiskd) runs on each node in the cluster, periodically evaluating its own health and then placing its state information into an assigned portion of the shared disk area. Each qdiskd then looks at the state of the other nodes in the cluster as posted in their area of the QDisk partition. When in a healthy state, the quorum of the cluster adds the vote count for each node plus the vote count of the qdisk partition.
GFS runs on each node in a cluster. As with all file systems, it is basically a kernel module that runs on top of the Virtual File System (VFS) layer of the kernel. It controls how and where the data is stored on a block device or logical volume. In order for cluster members to cooperatively share the data on a SAN, GFS relies on a distributed locking protocol to coordinate access.

4.6 DLM

The Distributed Lock Manager (DLM) is a cluster locking protocol in the form of a kernel module.
Please note that the SCSI fencing mechanism requires persistent SCSI reservations. Contact Red Hat technical support and your storage hardware vendor to verify whether your software and hardware configuration supports persistent SCSI reservations.

4.8 CLVM

Consistency must be ensured in all cluster configurations. Logical volume configurations are protected by the use of CLVM. CLVM is an extension to standard Logical Volume Management (LVM) that distributes LVM metadata updates to the cluster.
On a clustered volume group, the following command can be used to create a cluster-aware mirror:

# lvcreate -m1 -L 1G -n my_new_lv my_vg

4.10 Cluster Resource Manager

The Cluster Resource Manager (rgmanager) manages and provides failover capabilities for cluster resource groups. It controls the handling of user requests including service start, restart, disable, and relocate. The service manager daemon also handles restarting and relocating services in the event of failures.
Compared to other cluster concepts, the management and operation is straightforward. With classic application clusters, the complexity of management is proportional to the number of nodes in the cluster. Any change must be rolled out on every node. With a diskless shared root cluster, one can change information on any node and the change will be observed by all nodes automatically. No error-prone replication processes are required to submit changes on any node.
fence device, disable ACPI Soft-Off for that node. Otherwise, if ACPI Soft-Off is enabled, an integrated fence device can take four or more seconds to turn off a node (refer to note that follows). In addition, if ACPI Soft-Off is enabled and a node panics or freezes during shutdown, an integrated fence device may not be able to power off the node. Under those circumstances, fencing is delayed or unsuccessful.
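One way to disable ACPI Soft-Off on RHEL 5 is to switch the acpid service off on every cluster node; the following is a sketch of the typical commands, not a substitute for the fencing documentation:

# chkconfig acpid off
# service acpid stop

Alternatively, ACPI Soft-Off can often be disabled in the node BIOS; consult the documentation for the fence device in use for the recommended method.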
5.2.3 Hosts file

The /etc/hosts file for each cluster member should contain an entry defining localhost. If the external host name of the system is defined on the same line, the host name reference should be removed. Additionally, each /etc/hosts file should define the local interconnect of each cluster member.

5.3 Storage Configuration

5.3.1 Multipathing

Storage hardware vendors offer different solutions for implementing a multipath failover capability.
5.3.3.1 LVM Configuration

The LVM configuration file /etc/lvm/lvm.conf must be modified to enable the use of CLVM.

1. By default, the LVM commands scan all devices found directly in the /dev path. This is insufficient in dm-multipath configurations. There are two ways to enable multipath devices for LVM. The easiest is to modify the scan array in the configuration file as follows:

scan = [ "/dev/mapper", "/dev/cciss" ]

2. All changes to logical volumes and their states are communicated using locks. For CLVM, the locking type must therefore be set to the built-in clustered locking (locking_type = 3).
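A minimal sketch of the relevant lvm.conf changes for a CLVM setup (the device paths are examples; locking type 3 selects the built-in clustered locking described in Appendix C):

devices {
    # restrict scanning to multipath devices
    scan = [ "/dev/mapper", "/dev/cciss" ]
}
global {
    # use built-in clustered locking (clvmd)
    locking_type = 3
}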
mechanism. In a cluster setup, the lock protocol must be lock_dlm. Further information about the gfs_mkfs options can be obtained from the gfs_mkfs(8) man page. The following example formats the lv_ci logical volume with GFS:

# gfs_mkfs -j 3 -p lock_dlm -t lsrhc5:700_ci /dev/vg_sap700_gfs/lv_ci

5.3.4.2 fstab

In this GFS based setup, the GFS file systems are an integrated part of the operating system; i.e., the file systems are defined in the /etc/fstab file and mounted on all cluster nodes during the boot process, as sketched below.
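As an illustration, the corresponding /etc/fstab entries could look like the following; the logical volume name is taken from the example above, while the second volume and the mount points are placeholders:

/dev/vg_sap700_gfs/lv_ci       /usr/sap      gfs     defaults        0 0
/dev/vg_sap700_gfs/lv_sapmnt   /sapmnt       gfs     defaults        0 0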
gpgkey=http://download.atix.de/yum/comoonics/comoonics-RPM-GPG.key

Install the required open-sharedroot software packages:

# yum install comoonics-bootimage comoonics-cdsl-py comoonics-ec-py comoonics-cs-xsl-ec

5.5 Cluster Core Configuration

The cluster configuration file, /etc/cluster/cluster.conf (in XML format), for this cluster has the following outline.
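A minimal sketch of that outline is shown below; the cluster name matches the GFS example above, while the node names and the empty sections are placeholders. The complete configuration used during testing appears in Appendix A.

<?xml version="1.0"?>
<cluster name="lsrhc5" config_version="1">
    <clusternodes>
        <clusternode name="node1" nodeid="1" votes="1">
            <fence/>
        </clusternode>
        <clusternode name="node2" nodeid="2" votes="1">
            <fence/>
        </clusternode>
    </clusternodes>
    <fencedevices/>
    <rm/>
</cluster>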
2. Update the ccs cluster configuration:

# ccs_tool update /etc/cluster/cluster.conf

The <cluster> tag should define the following attributes:

Attribute        Description
config_version   Version number of the configuration
name             The name of the cluster

5.5.1 CMAN / OpenAIS

The OpenAIS daemon aisexec is started and configured by CMAN. Typically, all work is performed within the cman init script.
Attribute      Description
(…)            … to CMAN when it has a high enough score.
log_level      Controls the verbosity of the quorum daemon in the system logs. 0 = emergencies; 7 = debug.
log_facility   Controls the syslog facility used by the quorum daemon when logging. For a complete list of available facilities, see syslog.conf(5). The default value for this is daemon.
min_score      Absolute minimum score required to consider oneself "alive".
For more detailed information, refer to the qdisk man page. If device mapper multipath is used together with qdiskd, the values for tko and interval must be carefully considered. In the example case of a path failover, all storage I/O will be queued by the device mapper module. The qdisk timeout must be adapted to the device mapper's possible queueing time.

5.5.3 Fencing

The fencing configuration consists of two parts. The first is the configuration of the fencing daemon (fenced) itself.
fencing mechanism are encapsulated within the <fence> tag. Each fencing mechanism is defined by the <method> tag. Please refer to the man pages of fence as well as the man pages for the chosen fencing mechanisms for further details.

5.6 Local Root Cluster Installation

For a cluster with a local root file system configuration, the following steps must be performed on every cluster node:

1. Install the Red Hat Enterprise Linux 5 operating system
2. Install the required cluster packages
3.
After the installation of the first node is complete, the installation must be transferred to a shared root device and some modifications must be performed. For detailed installation steps, please reference the following documentation:

NFS shared root
http://open-sharedroot.org/documentation/rhel5-nfs-shared-root-mini-howto

GFS shared root
http://open-sharedroot.org/documentation/rhel5-gfs-shared-root-mini-howto

Yum channel for shared root software packages
http://open-sharedroot.
6.1.2 SAP Virtual IP Addresses

SAP NetWeaver is typically installed via the graphical installation tool sapinst. Before beginning the installation, determine which IP addresses and host names are preferred for use during the SAP installation. First, each node requires a static IP address and an associated host name. This address is also referred to as the physical IP address. Second, each database and SAP instance will require a virtual IP address / host name.
Follow the database file system configuration recommendations from the SAP installation guide. It is recommended to have physically different mount points for the program files and for saplog and sapdata.

Oracle

Create /oracle/client/10x_64/instantclient locally on every node. See the post-processing section for how to copy the binaries after the installation. Follow the database file system configuration recommendations from the SAP installation guide.
Depending on the Installation Master CD that was used for the SAP installation, the login profiles for the SAP administrator user (<sid>adm) and the database administrator user could differ. In older and non-HA installations, the user login profiles look similar to this one:

.sapenv_hostname.csh

Using the host name in the user login profiles is a problem in an HA environment. By default, the profiles .login, .profile and .
As a general requirement, the SAP parameter es/implementation must be set to "std" in the SAP DEFAULT.PFL file. See SAP Note 941735. The SAPInstance resource agent cannot use the AUTOMATIC_RECOVERY function for systems that have this parameter set to "map". In the START profiles, the parameter SAPSYSTEM must be set (default since 7.00).

6.1.5.4 SAP Release-specific Post-processing

For improved SAP hardware key determination in high-availability scenarios, see SAP Note 1178686. For SAP kernel release 4.
6.2.2 SAP Virtual IP Addresses

SAP NetWeaver is typically installed via the graphical installation tool sapinst. Before beginning the installation, determine which IP addresses and host names are preferred for use during the SAP installation. First, each node requires a static IP address and an associated host name. This address is also referred to as the physical IP address. Second, each database and SAP instance will require a virtual IP address / host name.
The transport directory /usr/sap/trans should also be exported via NFS according to your SAP landscape.

6.2.3.4 Before Starting the SAP Installation

Before installing SAP NetWeaver, mount all the necessary file systems. Be conscious of the overmount effect by mounting the hierarchically highest directories first.

6.2.4 Installation with sapinst

When starting the SAP installation tool sapinst, specify the virtual host name.
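One common way to do this (an illustration; check the installation guide for the release in use) is to pass the SAPINST_USE_HOSTNAME property when starting sapinst:

# ./sapinst SAPINST_USE_HOSTNAME=<virtual hostname>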
Copy the files /etc/oratab and /etc/oraInst.loc to the GFS mount point /oracle and create links to them.

6.2.5.3 SAP Profiles

The most important SAP profile parameter for a clustered SAP system is SAPLOCALHOST. After the installation with sapinst, make sure that all SAP instance profiles contain this parameter. The value of the parameter must be the virtual host name specified during the installation. As a general requirement, the SAP parameter es/implementation must be set to "std" in the SAP DEFAULT.PFL file.
6.3 Shared Root and Shared Storage with GFS

6.3.1 SAP Architecture

Following the established SAP documentation is highly recommended:

SAP Installation Guide
http://service.sap.com/instguides

SAP Technical Infrastructure Guide
https://www.sdn.sap.com/irj/sdn/ha

6.3.2 SAP Virtual IP Addresses

SAP NetWeaver is typically installed via the graphical installation tool sapinst. Before beginning the installation, determine which IP addresses and host names are preferred for use during the SAP installation.
The database directories can also completely reside on GFS. Follow the database file system setup recommendations from the SAP installation guide. It is recommended to have physically different mount points for specific directories. The GFS mounts from shared storage must be added to /etc/fstab so they get mounted at system boot.

6.3.3.3 NFS Mounted File Systems

The /sapmnt/<SID> file system should reside on a highly available NFS server so that it remains available to additional application servers outside the cluster.
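As a sketch, the client-side mount could be defined in /etc/fstab as follows; the NFS server name, export path, and SID are placeholders:

nfsserver:/sapmnt/RHC   /sapmnt/RHC   nfs   rw,hard,intr   0 0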
6.3.5.2 SAP Profiles

The most important SAP profile parameter for a clustered SAP system is SAPLOCALHOST. After the installation with sapinst, ensure that all SAP instance profiles contain this parameter. The value of the parameter must be the virtual host name specified during the installation. As a general requirement, the SAP parameter es/implementation must be set to "std" in the SAP DEFAULT.PFL file. See SAP Note 941735.
The following resource types will be defined to provide the high availability functionality for SAP.

7.2 Configuration

The resource group manager is configured within the cluster configuration file /etc/cluster/cluster.conf. The configuration is encapsulated within the <rm> tag. The resource manager configuration has the basic layout sketched below.
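A minimal sketch of that layout; the element names follow the rgmanager schema, while the service name and contents are illustrative only (the actual services used for testing appear in Appendix A):

<rm>
    <failoverdomains>
        ...
    </failoverdomains>
    <resources>
        ...
    </resources>
    <service name="rhc_ascs" autostart="1">
        ...
    </service>
</rm>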
basic configuration schema shown in the sketch below.

The failover domains can be configured in different ways.
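For illustration, an unrestricted, unordered failover domain spanning two hypothetical nodes could be defined like this:

<failoverdomains>
    <failoverdomain name="ALL" ordered="0" restricted="0">
        <failoverdomainnode name="node1" priority="1"/>
        <failoverdomainnode name="node2" priority="1"/>
    </failoverdomain>
</failoverdomains>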
7.4 Cluster Resources and Services

There are many types of cluster resources that can be configured. Resources are bundled together into highly available services; i.e., a service consists of one or more cluster resources. Resources can be used by any cluster service that requires one. Once associated with a cluster service, a resource can be relocated by the cluster if deemed necessary, or manually through a GUI interface, a web interface (conga) or via the command line.
7.4.1 IP

The ip resource defines an IPv4 or IPv6 network address. The following attributes can be defined:

Attribute      Description                                                             Required
address        IPv4 or IPv6 address to use as a virtual IP resource.                   Yes
monitor_link   Enabling this causes the status verification to fail if the link on
               the NIC to which this IP address is bound is not present.               No

7.4.2 Netfs

The netfs resource defines an NFS or CIFS mount.
7.4.3 FS

Attribute       Description                                                             Required
mountpoint      Path within file system hierarchy at which to mount this file system.   Yes
device          Block device, file system label, or UUID of file system.                Yes
fstype          File system type. If not specified, mount(8) will attempt to
                determine the file system type.                                         No
force_unmount   If set, the cluster will kill all processes using this file system
                when the resource group is stopped. Otherwise, the unmount will fail,
                and the resource group will be restarted.
standalone web dispatcher instance, which will fail to work with the resource agent. The next version of the agent may have a parameter that could be used to select which services should be monitored. However, this does not mean that an SAP web dispatcher cannot be included in another SAP instance that uses one of the monitored services (e.g., an SCS instance running a msg_server and an enserver). In this case, the web dispatcher will be started and stopped (together with the other services) by the cluster.
Attribute       Description                                                      Required
(…)             … after the default SAP installation.
                DEFAULT: /usr/sap/<SID>/<InstanceName>/exe or
                /usr/sap/<SID>/SYS/exe/run
DIR_PROFILE     The fully qualified path to the SAP START profile.               No
                Specify this parameter if you have changed the SAP profile
                directory location after the default SAP installation.
                DEFAULT: /usr/sap/<SID>/SYS/profile
START_PROFILE   The name of the SAP START profile.                               No (Yes for SAP 7.…)
                Specify this parameter if the name of the SAP …
Attribute Description Required be more important that the ABAP instance be up and running. A failure of the JAVA instance will not cause a failover of the SAP instance. Actually, the SAP MC reports a YELLOW status if the JAVA instance of a double stack system fails. From the perspective of the resource agent, a YELLOW status indicates all is well. Setting START_WAITTIME to a lower value causes the resource agent to verify the status of the instance during a start operation after that time.
Attribute            Description
POST_STOP_USEREXIT   (…) … uses. Such programs can be included by writing an OCF resource agent for the Heartbeat cluster. However, sometimes writing a resource agent is too much effort for this task. With the provided userexits, one can easily include scripts that do not follow the OCF standard in the cluster. Note that the return code of the script will not be used by the SAPInstance resource agent.
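As a sketch only, a SAPInstance resource for an ASCS instance might be declared as follows; the parameter names follow the attribute table above and the conventions of the SAPInstance agent, while the instance name, profile path, and virtual host name are placeholders that must match the actual installation:

<SAPInstance InstanceName="RHC_ASCS00_rhcascs"
             START_PROFILE="/usr/sap/RHC/SYS/profile/START_ASCS00_rhcascs"
             AUTOMATIC_RECOVER="true"/>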
7.4.5 SAPDatabase

Attribute        Description                                                      Required
DIR_EXECUTABLE   The fully qualified path to the SAP kernel. The resource agent   No
                 requires the startdb and the R3trans executables. For that
                 reason, the directory with the SAP kernel must be accessible
                 to the database server at any given time. Specify this
                 parameter if the SAP kernel directory location was changed
                 after the default SAP installation.
Attribute           Description                                                    Required
(…)                 … Not for use with Oracle, as it will result in unwanted
                    failovers in the case of a stuck archiver.
                    DEFAULT: false
AUTOMATIC_RECOVER   The SAPDatabase resource agent tries to recover a failed       No
                    start attempt automatically one time. This is achieved by
                    performing a forced abort of the RDBMS and/or executing
                    recovery commands.
                    DEFAULT: false
DIR_BOOTSTRAP       The fully qualified path to the J2EE instance bootstrap
                    directory, e.g. …
Attribute   Description                                                            Required
(…)         This is required only if the DBJ2EE_ONLY parameter is set to true.
            It will be automatically read from the bootstrap.properties file in
            Java engine 6.40 and 7.00. For Java engine 7.10, the parameter is
            mandatory.
            Example: /oracle/client/10x_64/instantclient/libclntsh.
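Similarly, a sketch of a SAPDatabase resource for an Oracle-based system; the SID value is a placeholder, and SID and DBTYPE are assumed here to be the agent's standard identification parameters:

<SAPDatabase SID="RHC"
             DBTYPE="ORA"
             AUTOMATIC_RECOVER="true"/>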
7.5 Dependencies

7.5.1 Resource Dependencies

The resources within a cluster service follow two different dependency rules. First, the nesting within the service configuration defines startup order and resource dependencies. In the example sketched below, resource2 depends on resource1. In addition, resource1 is started prior to starting resource2. Second, an implicit order and dependency is defined.
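A schematic illustration of the nesting described above, with placeholder names: the ip resource (resource2) is nested inside the fs resource (resource1), so the file system is started first and the IP address depends on it.

<service name="example_service">
    <fs name="resource1" device="/dev/vg_example/lv_example" mountpoint="/example" fstype="ext3">
        <ip address="192.168.1.100" monitor_link="1"/>
    </fs>
</service>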
7.5.2.2 Follow Service Dependency

The follow service dependency makes use of rgmanager's RIND event scripting mechanism. In order to activate the follow service dependency, central_processing must be enabled. Also, the following events must be defined within the <events> tag:

notice("Event service triggered!");
evalfile("/usr/share/cluster/follow-service.sl");
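A sketch of such an event definition follows; the event names and class attributes shown here are illustrative assumptions, while the script body corresponds to the configuration listed in Appendix A:

<events>
    <event name="service-follow" class="service">
        notice("Event service triggered!");
        evalfile("/usr/share/cluster/follow-service.sl");
        follow_service("service:rhc_ascs", "service:rhc_ers", "service:rhc_ascs");
    </event>
    <event name="node-follow" class="node">
        notice("Event node triggered!");
        evalfile("/usr/share/cluster/follow-service.sl");
        follow_service("service:rhc_ascs", "service:rhc_ers", "service:rhc_ascs");
    </event>
</events>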
8 Cluster Management

8.1 CMAN

The basic cluster operation can be verified using the cman_tool utility.

8.1.1 cman_tool status

The cman_tool status command can be used to show the status of one cluster node:

# cman_tool status
Version: 6.1.
type  level  name        id        state  node list
dlm   1      root        00040001  none   [1 2]
dlm   1      clvmd       00020001  none   [1]
dlm   1      rhc_ascs    00020002  none   [1 2]
dlm   1      rhc_usrsap  00040002  none   [1 2]
dlm   1      rhc_ers     00060002  none   [1 2]
dlm   1      rhc_oracle  00080002  none   [1 2]
dlm   1      rgmanager   00090002  none   [1 2]
gfs   2      root        00030001  none   [1 2]
gfs   2      rhc_ascs    00010002  none   [1 2]
gfs   2      rhc_usrsap  00030002  none   [1 2]
gfs   2      rhc_ers     00050002  none   [1 2]
gfs   2      rhc_oracle  00070002  none   [1 2]

8.2 rgmanager

8.2.
8.2.2 clusvcadm

The resource manager services can be controlled by the clusvcadm command. The basic operations are:

clusvcadm -e <service> -m <member>   starts <service> on <member>
clusvcadm -r <service> -m <member>   relocates <service> to <member>
clusvcadm -d <service>               disables/stops <service>

For detailed information, reference the clusvcadm(8) manpage.

8.2.
8.3.2 Update initrd

In a sharedroot cluster, the cluster configuration file /etc/cluster/cluster.conf must be copied into the initrd. Therefore, the process of updating the cluster configuration is combined with updating the initrd.

1. Copy the file /opt/atix/comoonics-cs/xsl/updateinitrd.xml to /etc/comoonics/enterprisecopy/updateinitrd.xml
2. Adjust the settings of /etc/comoonics/enterprisecopy/updateinitrd.xml to fit your needs, e.g., define the correct boot device:
enterprisecopy/localclone.xml

For detailed information regarding the comoonics enterprise software solution, please consult the Comoonics Enterprise Copy section of the Open-Sharedroot Administrators Handbook.

Appendix A: cluster.conf

The following /etc/cluster/cluster.conf file content was used during testing.
evalfile("/usr/local/cluster/follow-service.sl");
follow_service("service:rhc_ascs", "service:rhc_ers", "service:rhc_ascs");

notice("Event node triggered!");
evalfile("/usr/local/cluster/follow-service.sl");
follow_service("service:rhc_ascs", "service:rhc_ers", "service:rhc_ascs");
Appendix B: multipath.conf

The following /etc/multipath.conf file content was used during testing.

## This is the /etc/multipath.
## Device attributes for EMC CLARiiON
device {
        vendor          "DGC "
        product         "*"
        getuid_callout  "/sbin/scsi_id -g -u -s /block/%n"
        failback        manual
}
#
multipaths {
        multipath {
                wwid    360060160eda508008c55ad9b6a54db11
                alias   DGC_000
        }
        multipath {
                wwid    360060160eda508008d55ad9b6a54db11
                alias   DGC_001
        }
        multipath {
                wwid    360060160eda508003cb2f077e8f1db11
                alias   DGC_002
        }
        multipath {
                wwid    360060160eda508008e55ad9b6a54db11
                alias   DGC_003
        }
        multipath {
                wwid    360060160eda508008f55ad9b6a54db11
                alias   DGC_004
        }
        multipath {
                wwid    3600
        multipath {
                wwid    360060160eda50800a155ad9b6a54db11
                alias   DGC_012
        }
        multipath {
                wwid    360060160eda508009f55ad9b6a54db11
                alias   DGC_013
        }
        multipath {
                wwid    360060160eda508009d55ad9b6a54db11
                alias   DGC_014
        }
        multipath {
                wwid    360060160eda508009055ad9b6a54db11
                alias   DGC_015
        }
        multipath {
                wwid    360060160eda508009155ad9b6a54db11
                alias   DGC_016
        }
        multipath {
                wwid    360060160eda508009255ad9b6a54db11
                alias   DGC_017
        }
        multipath {
                wwid    360060160eda508009355ad9b6a54db11
                alias   DGC_018
        }

Appendix C: lvm.
scan = [ "/dev/mapper" ] # If several entries in the scanned directories correspond to the # same block device and the tools need to display a name for device, # all the pathnames are matched against each item in the following # list of regular expressions in turn and the first match is used. preferred_names = [ ] # preferred_names = [ "^/dev/mpath/", "^/dev/[hs]d" ] # # # # # # # A filter that tells LVM2 to only use a restricted set of devices. The filter consists of an array of regular expressions.
    # It is safe to delete the contents: the tools regenerate it.
    # (The old setting 'cache' is still respected if neither of
    # these new ones is present.)
    cache_dir = "/etc/lvm/cache"
    cache_file_prefix = ""

    # You can turn off writing this cache file by setting this to 0.
    write_cache_state = 1

    # Advanced settings.

    # List of pairs of additional acceptable block device types found
    # in /proc/devices with maximum (non-zero) number of partitions.
    # types = [ "fd", 16 ]

    # If sysfs is mounted (2.
    # What level of log messages should we send to the log file and/or syslog?
    # There are 6 syslog-like log levels currently in use - 2 to 7 inclusive.
    # 7 is the most verbose (LOG_DEBUG).
    level = 0

    # Format of output messages
    # Whether or not (1 or 0) to indent messages according to their severity
    indent = 1

    # Whether or not (1 or 0) to display the command name on each line output
    command_names = 0

    # A prefix to use before the message text (but after the command name,
    # if selected).
archive_dir = "/etc/lvm/archive" # What is the minimum number of archive files you wish to keep ? retain_min = 10 # What is the minimum time you wish to keep an archive file for ? retain_days = 30 } # Settings for the running LVM2 in shell (readline) mode. shell { # Number of lines of history to store in ~/.lvm_history history_size = 100 } # Miscellaneous global LVM2 settings global { library_dir = "/usr/lib64" # The file creation mask for any files and directories created.
"lvm2". # The command line override is -M1 or -M2. # Defaults to "lvm1" if compiled in, else "lvm2". # format = "lvm1" # Location of proc filesystem proc = "/proc" # Type of locking to use. Defaults to local file-based locking (1). # Turn locking off by setting to 0 (dangerous: risks metadata corruption # if LVM2 commands get run concurrently). # Type 2 uses the external shared library locking_library. # Type 3 uses built-in clustered locking.
    # For now, you need to set this up yourself first (e.g., with 'dmsetup')
    # For example, you could make it return I/O errors using the 'error'
    # target or make it return zeros.
    # Currently this is not implemented properly and behaves
    # similarly to:
    #
    # "allocate_anywhere" - Operates like "allocate", but it does not
    #     require that the new space being allocated be on a
    #     device is not part of the mirror. For a log device
    #     failure, this could mean that the log is allocated on
    #     the same device as a mirror device. For a mirror
    #     device, this could mean that the mirror device is
    #     allocated on the same device as another mirror device.
# Event daemon
#
# dmeventd {
    # mirror_library is the library used when monitoring a mirror device.
    #
    # "libdevmapper-event-lvm2mirror.so" attempts to recover from failures.
    # It removes failed devices from a volume group and reconfigures a
    # mirror as necessary.
    #
    # mirror_library = "libdevmapper-event-lvm2mirror.so"
# }