Building a disaster-proof data center with HP Serviceguard for Linux

Contents
Introduction
Evaluating the need for disaster tolerance
What is a disaster tolerant architecture?
Introduction

Linux remains the fastest-growing operating system in the world for a good reason. It has enabled countless companies worldwide to transform their IT strategy, capitalize on business-critical functionality, reduce IT implementation and maintenance expenditures, and achieve more flexibility for competitive preparedness over the long term. Today, many IT environments use Linux for an increasing number of disparate workloads, HP included.
Tip: You can view the Disaster Proof video at www.hp.com/go/DisasterProof

Evaluating the need for disaster tolerance

Disaster tolerance is the ability to restore applications and data within a reasonable period of time after a disaster. Fire, flood, and earthquake are the most common disasters, but a disaster can be any event that unexpectedly interrupts service or corrupts data in an entire data center, such as an act of sabotage or a backhoe that digs too deep and severs a network connection.
Figure 1. High availability architecture

This architecture, which is typically implemented on one site in a single data center, is sometimes called a local cluster. For some installations, the level of protection given by a local cluster is insufficient. Consider the order-processing center where power outages are common during harsh weather.
Figure 2. Disaster tolerant architecture

Understanding types of disaster-tolerant clusters

To protect against multiple points of failure, cluster components must be geographically dispersed: nodes can be put in different rooms, on different floors of a building, or even in separate buildings or cities. The distance between the nodes is dependent on the types of disaster from which you need protection and on the technology used to replicate data.
Extended distance clusters

Note: Extended distance clusters were formerly known as campus clusters, but that term is not always appropriate because the supported distances have increased beyond the typical size of a single corporate campus.

An extended distance cluster (also known as an extended campus cluster) is a normal Serviceguard cluster that has alternate nodes located in different data centers separated by some distance, with a third location supporting the quorum service.
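The role of the third location can be illustrated with a small model. The text above says only that a third site supports the quorum service. The arbitration rule sketched here (a strict majority of nodes survives, and an exact half survives only if it obtains the quorum vote) is an assumption used for illustration, and the function name `partition_survives` is hypothetical rather than part of any Serviceguard interface.

```python
def partition_survives(nodes_in_partition: int, total_nodes: int,
                       has_quorum_vote: bool) -> bool:
    """Decide whether one side of a split cluster may keep running.

    Assumed rule (not taken from the source document):
      - more than half of the nodes: survive
      - exactly half of the nodes: survive only with the quorum vote
      - fewer than half: halt
    """
    if total_nodes <= 0 or not 0 <= nodes_in_partition <= total_nodes:
        raise ValueError("invalid node counts")
    if 2 * nodes_in_partition > total_nodes:
        return True                 # strict majority needs no tie-breaker
    if 2 * nodes_in_partition == total_nodes:
        return has_quorum_vote      # even split: the third site decides
    return False                    # minority side must stop


# Four-node cluster, two nodes per data center, inter-site links severed.
# Only one side can obtain the vote from the third location.
site_a_runs = partition_survives(2, 4, has_quorum_vote=True)
site_b_runs = partition_survives(2, 4, has_quorum_vote=False)
print(site_a_runs, site_b_runs)  # True False
```

Placing the tie-breaker at a third location matters because a disaster that destroys one data center should not also take the deciding vote with it.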
Figure 3. Extended distance cluster

In the configuration shown in Figure 3, the network and Fibre Channel links between the data centers are combined and sent over common DWDM links. Two DWDM links provide redundancy: when one of them fails, the other can remain active and keep the two data centers connected. With DWDM links, clusters can be extended to greater distances than were previously possible under the limits imposed by Fibre Channel for storage and Ethernet for networks.
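The redundancy argument can be shown with a minimal sketch: the two sites stay connected as long as at least one inter-site link is up. The `Link` class and the link names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str        # hypothetical label for an inter-site link
    up: bool = True  # whether the link currently carries traffic


def sites_connected(links) -> bool:
    """The data centers are connected if any inter-site link is up."""
    return any(link.up for link in links)


links = [Link("dwdm-1"), Link("dwdm-2")]

links[0].up = False                 # first link fails
print(sites_connected(links))       # True: the second link still carries traffic

links[1].up = False                 # second link fails as well
print(sites_connected(links))       # False: the sites are now isolated
```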
Figure 4. Two data center setup

Figure 4 shows a configuration that is supported with separate network and Fibre Channel links between the data centers. In this configuration, the Fibre Channel links and the Ethernet networks are not carried over DWDM links. However, each of these links is duplicated between the two data centers for redundancy.
Benefits of Extended Distance Cluster

The following table discusses the benefits of Extended Distance Cluster.

This configuration implements a single Serviceguard cluster across two data centers and uses the Multiple Device (MD) driver for data replication. You can choose any mix of Fibre Channel-based storage supported by Serviceguard that also supports the QLogic driver multipath feature.
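Host-based mirroring of the kind described above can be pictured with a toy in-memory model: each write goes to a mirror leg in both data centers, so a read can still be served from the remote leg when the local one is unavailable. This is a simplified sketch, not the MD driver itself. The class names are hypothetical, and resynchronization of a leg that returns after a failure is deliberately left out.

```python
class MirrorLeg:
    """One side of the mirror, for example a disk in one data center."""

    def __init__(self, site: str):
        self.site = site
        self.healthy = True
        self.blocks = {}  # block number -> data


class Mirror:
    """Toy two-way mirror: write to all healthy legs, read from the first."""

    def __init__(self, legs):
        self.legs = list(legs)

    def write(self, block_no: int, data: bytes) -> None:
        targets = [leg for leg in self.legs if leg.healthy]
        if not targets:
            raise OSError("no healthy mirror leg available")
        for leg in targets:
            leg.blocks[block_no] = data

    def read(self, block_no: int) -> bytes:
        # Legs are tried in order, so list the local leg first.
        for leg in self.legs:
            if leg.healthy:
                return leg.blocks[block_no]
        raise OSError("no healthy mirror leg available")


local = MirrorLeg("data-center-1")
remote = MirrorLeg("data-center-2")
mirror = Mirror([local, remote])

mirror.write(0, b"order-1001")
local.healthy = False           # the local disk becomes unavailable
print(mirror.read(0))           # b'order-1001', served from the remote leg
```

In this model the application keeps running on the same node after the local leg is lost, which is the behavior that makes a node failover unnecessary for a disk-only failure.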
Figure 5. Serviceguard for Linux and Cluster Extension disaster proof demonstration architecture

HP StorageWorks XP Cluster Extension Software offers protection against system downtime to critical applications for enterprise customers using the HP StorageWorks XP Disk Array family. HP Cluster Extension enables hands-free failover and failback decision-making as it detects failures and automatically manages recovery without human intervention.
server cluster. The correct failover and failback decisions are made automatically, minimizing downtime and accelerating recovery.

Benefits of Cluster Extension

Cluster Extension offers a more resilient solution than Extended Distance Cluster, as it provides complete integration between the Serviceguard application package and the data replication subsystem. The storage subsystem is queried to determine the state of the data on the arrays.
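The idea of querying the storage subsystem before acting can be sketched as a simple gate in front of package startup. This is an illustrative model only: the state names in `ReplicaState` and the function `may_start_package` are hypothetical and do not correspond to actual XP array pair states or to any Cluster Extension interface.

```python
from enum import Enum
from typing import Callable


class ReplicaState(Enum):
    # Hypothetical states, chosen for illustration only.
    IN_SYNC = "in_sync"    # local copy is current and consistent
    STALE = "stale"        # local copy is behind the other site
    UNKNOWN = "unknown"    # the array could not report a state


def may_start_package(query_state: Callable[[], ReplicaState]) -> bool:
    """Allow the application package to start only on known-good data.

    `query_state` stands in for whatever mechanism asks the array
    about the replicated volumes (an assumption in this sketch).
    """
    return query_state() is ReplicaState.IN_SYNC


print(may_start_package(lambda: ReplicaState.IN_SYNC))  # True
print(may_start_package(lambda: ReplicaState.STALE))    # False
```

The design point is that the check runs before the application starts, so the package never comes up on data whose state has not been determined.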
Differences Between Extended Distance Cluster and Cluster Extension

The major differences between an Extended Distance Cluster and Cluster Extension are:

• A key difference between extended distance clusters and HP Cluster Extension is the data replication technology used. The two basic methods available for replicating data between the data centers for Linux clusters are host-based and storage array-based. Extended Distance Cluster always uses host-based replication (MD software mirroring on Linux).
Attribute: Key Benefit

Extended Distance Cluster: Excellent in “normal” operations and partial failure. Since all hosts have access to both disks, in a failure where the node is running and the application is up, but the disk becomes unavailable, no failover occurs. The node will access the remote disk to continue processing.

CLX: Two significant benefits:
• Provides maximum data protection. State of the data is determined before application is started.
Conclusion

The business-continuity and availability solutions of HP empower customers to improve their business performance and protect their corporate reputations with resilient IT. The health of your business depends on access to critical IT services and information. Virtually any amount of IT downtime can mean lost productivity, lost revenue, lost customers, and lost opportunities. That means you need to be prepared for the full range of threats to the availability and stability of your core infrastructure.
For more information

Main website: http://www.hp.com/go/DisasterProof
Business Continuity & Availability: http://www.hp.com/go/continuityandavailability
Disaster tolerance: http://www.hp.com/go/disastertolerant
Bulletproof XP demonstration: http://www.hp.com/go/storageworks/bulletproofxp
HP Serviceguard for high availability and disaster tolerance: http://www.hp.com/go/ha
Serviceguard for Linux: http://www.hp.com/go/sglx
HP Open Source and Linux: http://www.hp.