Dell EMC SC Series: Synchronous Replication and Live Volume
Technical White Paper

Abstract
This document provides descriptions and use cases for the Dell EMC SC Series data protection and mobility features of synchronous replication and Live Volume.
Revisions

Date            Description
May 2014        Merged synchronous replication and Live Volume documents; updated for Enterprise Manager 2014 R2 and SCOS 6.5
July 2014       vSphere HA PDL update
November 2015   Updated for SCOS 6.7
July 2016       Updated for SCOS 7.1 and DSM 2016 R2
October 2016    Minor updates
February 2017   Updated guidance on MPIO settings for Windows Server and Hyper-V
July 2017       Minor updates
December 2017   Minor updates to section 3.4.1
April 2018      Updated for SCOS 7.
Table of contents
Revisions
Acknowledgments
Table of contents
8 VMware vSphere and Live Volume
8.1 Path Selection Policies (PSP)
8.2 Round Robin with Live Volume ALUA
8.3 Fixed
10.7 Use cases
11 Live Volume use cases
11.1 Zero-downtime SAN maintenance and data migration
11.
Executive summary Executive summary Preventing the loss of data or transactions requires a reliable method of continuous data protection. In the event of a disaster or unplanned outage, applications and services must be made available at an alternate site as quickly as possible. A variety of data mobility methods, including asynchronous replication, can accomplish the task of providing offsite replicas.
Introduction to synchronous replication 1 Introduction to synchronous replication While SC Series storage supports both asynchronous and synchronous replication, this document focuses primarily on synchronous replication. By definition, synchronous replication ensures data is written and committed to both the replication source and destination volumes in real time. The data is essentially written to both locations simultaneously.
Introduction to synchronous replication 1.2.3 Licensing Replication licensing, which includes synchronous replication and asynchronous replication, is required for each SC Series array participating in volume replication. Additionally, a Live Volume license for each array is required for all Live Volume features. With Dell EMC SC All-Flash storage arrays such as the SC5020F and SC7020F, the replication and Live Volume licensing are included. 1.2.
Data replication primer 2 Data replication primer Data replication is one of many options that exist to provide data protection and availability. The practice of replication evolved out of a necessity to address a number of matters such as substantial data growth, shrinking backup windows, more resilient and efficient disaster recovery solutions, high availability, mobility, globalization, cloud, and regulatory requirements.
Figure 1 demonstrates the write I/O pattern sequence with synchronous replication:
1. The application or server sends a write request to the source volume.
2. The write I/O is mirrored to the destination volume.
3. The mirrored write I/O is committed to the destination volume.
4. The write commit at the destination is acknowledged back to the source.
5. The write I/O is committed to the source volume.
6. Finally, the write acknowledgement is sent to the application or server.
Figure 2 demonstrates the write I/O pattern sequence with respect to asynchronous replication:
1. The application or server sends a write request to the source volume.
2. The write I/O is committed to the source volume.
3. Finally, the write acknowledgement is sent to the application or server.
The process is repeated for each write I/O requested by the application or server.
4.
Figure 3 demonstrates the write I/O pattern sequence with semi-synchronous replication:
1. The application or server sends a write request to the source volume.
2. The write I/O is committed to the source volume.
3. The write acknowledgement is sent to the application or server.
The process is repeated for each write I/O requested by the application or server. For each write I/O that completes that process, there is an independent and parallel process:
a.
3 Synchronous replication features SC Series storage supports a wide variety of replication features. Each feature is outlined in the following sections. 3.1 Modes of operation A number of evolutionary improvements have been made to enhance synchronous replication with SC Series arrays. Among these improvements is the choice of replication mode on a per-volume basis. Synchronous replication can be configured in one of two modes: high consistency or high availability. 3.1.
Synchronous replication features threshold, journaled I/O at the source volume is flushed to the destination volume where it will be committed. During this process, incoming application writes continue to be written to the journal. After all journaled data is committed to the destination volume, the source and destination will be in sync and the data on both volumes will be consistent.
exposure as well as the replication link bandwidth consumed to recover. Minimal recopy is also employed in high consistency mode should the destination volume become unavailable during initial synchronization or should an administrator invoke a pause operation on the replication. Flushing journaled writes to the destination volume to regain volume consistency 3.
Synchronous replication features 3.4 Multiple replication topologies Dell extends synchronous replication support beyond just a pair of SC Series volumes residing in the same or different sites. A choice of two topologies or a hybrid combination of both is available. 3.4.1 Mixed topology The mixed topology, also known as 1-to-N (N=2 as of SCOS 6.
Synchronous replication features After DR activation, a replica volume can be replicated to another replica with efficiency 3.4.2 Cascade topology The cascade topology allows asynchronous replications to be chained to synchronous or asynchronous replication destination volumes. This topology is useful in providing immediate reprotection for a recovery site. Similar to the mixed topology, it provides a flexible choice of locations for data recovery or business continuation practices.
Synchronous replication features newer, Live Volume is designed to work in conjunction with asynchronous and synchronous replication types. In addition, Live Volume supports many of the current synchronous replication features such as modes of operation and mode migration. 3.5.1 Preserve Live Volume In SCOS 6.5 or newer, recovering data from a secondary Live Volume, when the primary Live Volume is unavailable, is faster, easier, and more flexible.
Synchronous replication features replica. When using high consistency synchronous replication, data between source and destination must be consistent for DSM to advise it is safe to use the destination replica for recovery. When using high availability synchronous replication (or high consistency with the ability to pause replication), the data between source and destination volumes may or may not be consistent depending on whether the replication was in sync or out of date at the time of the failure.
Synchronous replication use cases 4 Synchronous replication use cases Replicating data can be a valuable and useful tool, but replication by itself serves no purpose without tying it to a use case to meet business goals. The following sections highlight sample use cases for synchronous replication. 4.
High consistency synchronous with consolidated vSphere or Hyper-V sites A replication link or destination volume issue in St. Paul results in a VM outage in Minneapolis 4.2.2 Microsoft SQL Server and Oracle Database/Oracle RAC Database servers and clusters in critical environments are often designed to provide highly available, high-throughput, and low-latency access to data for application-tier servers and sometimes directly to application developers or end users.
A replication link or destination volume issue in St. Paul results in a database outage in Minneapolis To summarize, there are high consistency use cases that can be integrated with virtualization as well as database platforms. The key benefits provided are data consistency and zero transaction loss. Keep in mind that the infrastructure supporting synchronous replication between sites must perform sufficiently.
Synchronous replication use cases asynchronous replication. Adding significant distance between sites generally results in latency increases which will still be observed in the applications at the source side for as long as the high availability replication is in sync. Finally, if virtual machines are deployed in a configuration that spans multiple volumes, consider using Replay Manager or consistency groups. Replay Manager is covered in section 11.6.
Synchronous replication use cases accomplished with Replay Manager (especially recommended for Microsoft products through VSS integration) or by containerizing volumes by use of consistency groups. To create consistency across snapshots using consistency groups, a snapshot profile is created with a snapshot Creation Method of Consistent (Figure 18). This profile is then applied to all volumes containing the dataset.
Synchronous replication use cases high availability mode recovery, disaster recovery, or remote replicas which will be discussed in the coming sections. High availability synchronous with databases A replication link or destination volume issue in St. Paul results in no database outage in Minneapolis 4.4 Remote database replicas One practice commonly found in organizations with Microsoft SQL Server or Oracle database technologies is to create copies of databases.
Database replicas distributed in a mixed topology 4.5 Disaster recovery With data footprints growing exponentially, backup and maintenance windows shrinking along with the cost of storage, and the impact of downtime gnawing on the conscience of businesses, migrating to online storage-based data protection strategies is trending for a variety of organizations.
Synchronous replication use cases infrastructure (this includes network, fabric, and storage) can scale to support the amount of data being replicated and the rate of change. 4.5.1 Hyper-V and VMware As discussed in previous sections, replicating the file objects that form the construct of a virtual machine takes advantage of the intrinsic encapsulation and portability attributes of a VM.
going to satisfy a 24-hour RTO. Data growth on tapes means that there is a growing number of sequential-access, long-seek-time tapes for restoration. This diminishes the chances of meeting RTO, and increases the chances that one bad tape will cause data recovery to fail. Data replication is a major player in meeting RTO. Intra-volume consistency is extremely important in a distributed virtual machine disk or database volume architecture.
Synchronous replication use cases DSM bundles DR automation tools to help meet RTO requirements For virtualization and database use cases alike, Unisphere Central is used to create asynchronous or synchronous replications. These replications predefine the destination volumes that will be presented to storage hosts for disaster recovery. Note: Destination replica volumes are for DR purposes only and should not be used actively in a Microsoft or VMware vSphere Metro Storage Cluster design.
Synchronous replication use cases resync login accounts). For VMware vSphere and Hyper-V hosts, VM datastores are now visible to the hosts and VMs need to be added to the inventory so that they can be allocated as compute and storage resources by the hypervisor and then powered on. In Hyper-V 2008 R2, the configuration file for each virtual machine must be generated with the correct number of processors, memory, network card, and attached virtual machine disk files.
Synchronous replication use cases VMware vSphere Site Recovery Manager and SC Series active/active architecture The DSM server must be available to perform DR testing or an actual DR cutover for an automated DR solution involving SC Series storage replication. This means making sure that at least one DSM server resides at the recovery site so that it can be engaged when needed for DR plan execution.
Synchronous replication use cases 4.5.4 Topologies and modes Replication of data between or within data centers is the fastest, most efficient, and automated method for moving large amounts of data in order to provide replica data, data protection, and business continuation in the event of a disaster. This section provides examples of the different topologies and modes available with synchronous replication in addition to appropriate uses of asynchronous replication.
Intra-campus and metro DR sites; cascade topology; Fibre Channel and iSCSI replication
Intra-campus, metro, and remote DR sites; hybrid topology; Fibre Channel and iSCSI replication
The examples in Figure 26 through Figure 30 serve to represent physical hosts or virtual machines.
Live Volume overview 5 Live Volume overview Live Volume is a high availability and data mobility feature for SC Series storage that builds on the Dell Fluid Data™ architecture. It enables non-disruptive data access, data migration, and stretched volumes between SC Series arrays. It also provides the storage architecture and automatic failover required for VMware vSphere Metro Storage Cluster certification (vMSC) with SCOS 6.7 and newer. SCOS 7.
• Supports asynchronous or synchronous replication and included features such as:
  - Snapshots
  - High consistency and high availability modes
  - Mode migration
  - DR activation for Live Volume managed replications
• Supports an additional asynchronous or synchronous Live Volume replication to a third array created and dynamically managed by Live Volume
• Provides automatic or manual Live Volume failover and restore in the event of an unplanned outage at a primary Live Volume site
• Includes
Creating a Live Volume 5.2 Proxy data access An SC Series Live Volume consists of a pair of replication-enabled volumes: a primary Live Volume and a secondary Live Volume. A Live Volume can be accessed through either array supporting the Live Volume replication. However, the primary Live Volume role can only be active on one of the available arrays. All read and write activity for a Live Volume is serviced by the array hosting the primary Live Volume.
In Figure 33, a mapped server is accessing a Live Volume by proxy access through the secondary Live Volume system to the primary Live Volume system. This type of proxy data access requires the replication link between the two arrays to have sufficient bandwidth and minimal latency to support the I/O operations and latency requirements of the application data access. Proxy data access through the Secondary Live Volume 5.
Live Volume overview Uniform Live Volume with ALUA and Round Robin PSP ALUA can be enabled on Live Volumes when both SC Series arrays are running SCOS 7.3 or newer. For new Live Volumes, select the Live Volume option to enable Report Non-optimized Paths.
For pre-existing Live Volumes, once both SC Series arrays are upgraded to SCOS 7.3, a banner will be displayed allowing these Live Volumes to be upgraded to support ALUA capability. Banner in DSM 2018 showing Live Volumes can be upgraded for ALUA capability Dell Storage Manager 2018 and newer provides guidance through the upgrade process.
Live Volume overview After the Live Volume is upgraded for ALUA capability, the last step is to choose whether or not to Report Non-optimized Paths. The default action is to report non-optimized paths. However, if the storage host operating system does not support ALUA but there is a desire to upgrade the Live Volume for ALUA support, this option allows leaving the ALUA feature disabled from the viewpoint of the storage host operating system.
Live Volume overview It is recommended to use dedicated VLANs or fabrics to isolate IP-based storage traffic from other types of general-purpose LAN traffic, especially when spanning data centers. While this is not a requirement for Live Volume, it is a general best practice for IP-based storage.
Live Volume overview 5.5 Replication and Live Volume attributes Once a Live Volume is created, additional attributes can be viewed and modified in Unisphere Central for SC Series as depicted in Figure 40. Live Volume settings 5.5.1 Replication Live Volume is built on standard SC Series storage replicated volumes in which each replicated volume can individually be configured as asynchronous, synchronous high availability, or synchronous high consistency.
present on the connection, Dell Storage recommends disabling Deduplication for Live Volumes in order to preserve controller CPU time for other processes. Replicate Active Snapshot: It is recommended that Replicate Active Snapshot be enabled for asynchronous Live Volumes. This ensures that data is replicated in real time as quickly as possible, which decreases the amount of time required to perform a Live Volume role swap.
Live Volume overview Editing a Live Volume with QoS nodes As a best practice, common QoS Nodes should not be shared between a single SC Series source and multiple SC Series destinations, particularly where Live Volume managed replications are in use.
The autoswap design is meant to make intelligent decisions on the automatic movement of the primary Live Volume between systems while preventing role swaps from occurring rapidly back and forth between arrays. Min Amount Before Swap: This attribute is the minimum amount of data that must be accessed from a secondary system before an automatic swap is considered. If there is light, infrequent access to a Live Volume from a secondary array, consider whether it makes sense to move the primary to that system; if so, set this value to a very small value.
Data Progression and Live Volume 6 Data Progression and Live Volume Data Progression lifecycles are storage profiles that are managed independently on each SC Series array configured with Live Volume. If a Live Volume is not replicated to the lowest tier on the destination array, data ingestion follows the volume-based storage profile. The Data Progression lifecycle on the destination array then moves data based on the destination storage profile.
Live Volume and MPIO 7 Live Volume and MPIO By using Live Volume with a storage host that has access to both SC Series arrays in a Live Volume configuration, multiple paths can be presented to the server across the arrays. All read and write I/O ultimately flows to the primary Live Volume. Read and write I/O sent down paths to the primary Live Volume are handled directly. Read and write I/O sent down paths to the secondary Live Volume are proxied to the primary Live Volume via replication ports.
Live Volume and MPIO swap would be preferred with non-uniform storage presentation so that a role swap will automatically follow the vMotion of virtual machines between sites. Regardless of storage presentation, MPIO path selection, and role swap policies, synchronous replication latency between arrays will impact applications.
8 VMware vSphere and Live Volume When vSphere and Live Volume combine, they can provide VMware-certified, large-scale application and service mobility, high availability, planned maintenance, resource balancing, disaster avoidance, and disaster recovery options for virtual environments. 8.1 Path Selection Policies (PSP)
Note: ALUA information used by the Round Robin PSP for a device can also be viewed using esxcli as shown in the following example. TPG_state identifies the active/optimal and active/non-optimal ALUA state for each target port group. Working Paths identifies each active/optimal path for Round Robin as derived from the target port groups.

[root@s1212:~] esxcli storage nmp device list -d naa.6000d31000ed1f000000000000000015
naa.6000d31000ed1f000000000000000015
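Where Round Robin is the desired policy, it can also be set as the host default for newly claimed devices rather than per device. A minimal sketch, assuming ALUA-enabled Live Volumes are claimed by the VMW_SATP_ALUA plug-in (verify the SATP in use with the device listing above):

# Make Round Robin the default PSP for devices claimed by the ALUA SATP
esxcli storage nmp satp set -s VMW_SATP_ALUA -P VMW_PSP_RR

# Confirm the default PSP assignments
esxcli storage nmp satp list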
For customers concerned about this behavior occurring in their environments, there are two workarounds:
• After the Live Volume role swap, perform a vSphere storage rescan.
• Reduce the vSphere host advanced setting Disk.PathEvalTime from the default of 300 seconds down to an acceptable automatic storage rescan interval. This would need to be performed on each vSphere host the Live Volume is mapped to.
Example: Reducing the Disk.PathEvalTime value
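A sketch of one way to apply this from the ESXi shell (the 60-second value is only an example; the same setting can be changed in the vSphere Client host advanced settings):

# Lower the path evaluation interval from the 300-second default to 60 seconds (example value)
esxcli system settings advanced set -o /Disk/PathEvalTime -i 60

# Verify the current value
esxcli system settings advanced list -o /Disk/PathEvalTime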
VMware vSphere and Live Volume 8.3 Fixed The Fixed PSP may be desirable when a preferred path on the storage fabric should be used. The Fixed PSP may also be useful in a uniform Live Volume storage presentation, if the ALUA feature (added in SCOS 7.3) is not yet available, in order to avoid sending read and write I/O down a non-optimal path to the secondary Live Volume.
VMware vSphere and Live Volume As depicted in Figure 45, a Live Volume replication exists between SC Series A and SC Series B. Two vSphere hosts are mapped to the Live Volume on each array. The Primary Live Volume is located on SC Series A, and the Fixed preferred path (Figure 44) on each vSphere host is configured to use a path to SC Series A as the preferred path.
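Setting the Fixed PSP and a preferred path can also be scripted per device from the ESXi shell; a minimal sketch, where the device ID and path name are hypothetical placeholders for paths to SC Series A:

# Claim the device with the Fixed PSP (device ID is hypothetical)
esxcli storage nmp device set -d naa.6000d31000ed1f000000000000000015 -P VMW_PSP_FIXED

# Set the preferred path (path name is hypothetical)
esxcli storage nmp psp fixed deviceconfig set -d naa.6000d31000ed1f000000000000000015 -p vmhba1:C0:T1:L1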
VMware vSphere and Live Volume In this configuration, the MPIO policy for the primary Live Volume can be configured as either Round Robin (preferred) or Fixed. Enabling Live Volume automatic role swap would be optimal in this configuration where vMotion is used to migrate virtual machines between sites (for example, from Site A to Site B).
8.8 Live Volume automatic failover Configuring Live Volume for synchronous high availability replication and automatic failover (Figure 47) is a key requirement for vSphere Metro Storage Cluster deployments. Live Volume and replication may be configured on a per-volume basis using the SC Series vSphere web client plug-in or Unisphere Central for SC Series. The Failover Automatically feature for Live Volume is configured using DSM.
8.9.1 vMSC uniform storage presentation Uniform presentation (Figure 48) tends to be a more common design, which provides host-to-Live-Volume storage paths through both arrays. This means that half of the I/O will go through the primary Live Volume and the other half will be proxied to the primary through the secondary Live Volume and the replication link.
VMware vSphere and Live Volume 8.9.2 vMSC non-uniform storage presentation Non-uniform means that each vSphere Metro Storage Cluster host will access a Live Volume through one array or the other, but not both. Each cluster node has read/write access to either the primary or the secondary Live Volume, but not both simultaneously. For the cluster nodes with access to the primary Live Volume, their front-end I/O remains local in proximity.
VMware vSphere and Live Volume 8.10 Tiebreaker service A tiebreaker service is built into the Dell Storage Manager Data Collector as well as the Remote Data Collector, and acts as a quorum to facilitate Live Volume automatic failover. It also plays an important role in preventing split-brain conditions should a network or fabric partition between arrays occur. The tiebreaker should be located at a site physically independent of arrays participating in a Metro Cluster.
VMware vSphere and Live Volume 8.11.2 Secondary Live Volume array failure If an unplanned event impacts an SC Series array hosting a secondary Live Volume, the primary Live Volume remains available on the surviving array. Secondary Live Volume failure 8.11.3 Replication network partition If an unplanned event impacts the replication link between SC Series arrays, the primary Live Volume will continue to remain available serving I/O and no automatic failover will occur.
VMware vSphere and Live Volume 8.11.4 SC Series back-end outage If an unplanned event impacts primary Live Volume back-end SC Series components (such as the loss of several drives or a drive shelf), the primary Live Volume will automatically fail over to the surviving array. Both arrays will continue to service I/O requests from the primary Live Volume locally and remotely through the replication link. Primary back-end failure 8.11.
VMware vSphere and Live Volume 8.11.6 Tiebreaker service link failure Similar to the previous example, if either SC Series array loses network connectivity to the tiebreaker service on the DSM server, both arrays will continue to service I/O requests but no automatic failover can occur during this time. Tiebreaker link failure 8.12 Detailed failure scenarios The following table outlines tested design and component failure scenarios with Live Volume automatic failover enabled with vSphere HA.
Event scenario | Live Volume behavior | vSphere HA behavior
Non-uniform: SC Series controller-pair outage takes down primary Live Volumes | Primary Live Volumes automatically recovered at remote site | vSphere HA restarts impacted VMs at remote site
Non-uniform: SC Series controller-pair outage takes down secondary Live Volumes | Primary Live Volumes remain available at remote site | vSphere HA restarts impacted VMs at remote site
Uniform: SC Series back-end outage (such as loss
Event scenario | Live Volume behavior | vSphere HA behavior
Tiebreaker service failure | No automatic failover; primary Live Volumes remain available; secondary Live Volumes remain available | None – VMs remain running at both sites
Tiebreaker service network isolated from either or both array sites | No auto failover; primary Live Volumes remain available; secondary Live Volumes remain available | None – VMs remain running at both sites
8.
VMware vSphere and Live Volume Live Volume configured with Restore Automatically enabled 8.14 VMware DRS/HA and Live Volume VMware Distributed Resource Scheduler (DRS) is a cluster-centric configuration that uses VMware vSphere vMotion® to automatically move virtual-machine compute resources to other nodes in a cluster. This is performed without local, metro, or stretched-site awareness. In addition, vSphere is unaware of storage virtualization that occurs in Live Volume.
• can then be performed at a containerized-group level rather than at an individual virtual-machine level.
• Host groups: Hosts that share a common site can be placed into host groups. Once the host groups are configured, they can represent locality for the primary Live Volume. VM groups can be assigned to host groups using the DRS Groups Manager. This will ensure all virtual machines that share a common Live Volume datastore are consistently running from the same datastore.
VMware vSphere and Live Volume The following vSphere advanced tuning should be configured for non-uniform stretched cluster configurations. This tuning allows HA to power off and migrate virtual machines during storage-related availability events after the primary Live Volume becomes available again using the Preserve Live Volume or Live Volume Failover Automatically feature. Note: The Live Volume Failover Automatically feature is only supported with vSphere 5.5 and newer.
VMware vSphere and Live Volume Configuring vSphere HA for PDL and APD conditions The Disk.AutoremoveOnPDL advanced setting is not configurable in VMCP and should remain at its default value of 1 for each vSphere 6 host in the cluster. For more information on the Disk.AutoremoveOnPDL feature, refer to VMware KB article 2059622, PDL AutoRemove feature in vSphere 5.5 and vSphere 6.0.
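The current value can be confirmed per host from the ESXi shell; a quick sketch:

# Confirm Disk.AutoremoveOnPDL remains at its default value of 1
esxcli system settings advanced list -o /Disk/AutoremoveOnPDL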
• VMware vSphere 5.5 or newer
• SC Series best practices with VMware vSphere followed and configured
• Uniform or non-uniform storage presentation of volumes to vSphere hosts
• Fixed or Round Robin path selection policy (PSP)
• Live Volume with automatic failover with VMFS datastores or virtual mode raw device mappings (RDMs), with support for physical mode RDMs (added with SCOS 7.
8.16
VMware vSphere and Live Volume One differentiator between standard replication and Live Volume managed replication is that the source volume of a managed replication is dynamic and changes as Live Volume role swaps occur. The easiest way to think of this is to understand that the managed replication source is always the primary Live Volume.
Live Volume support for Microsoft Windows/Hyper-V 9 Live Volume support for Microsoft Windows/Hyper-V This section details aspects of the Live Volume feature set that are specific to Microsoft environments, such as best practices for MPIO settings and Live Volume automatic failover (LV-AFO), including ALUA support with SCOS 7.3. It is recommended to review the prior general sections of this document before proceeding in this section.
Live Volume support for Microsoft Windows/Hyper-V 9.3 Round Robin with Subset (ALUA) Round Robin with Subset (ALUA) is supported when a Live Volume is configured to report non-optimized paths to server hosts. This ability is available with SCOS 7.3 and newer, along with DSM 2018 R1 and newer. ALUA support is enabled by default when configuring a new Live Volume with SCOS 7.3.
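On Windows Server, the per-path states that MPIO reports for a disk (including active/optimized versus active/unoptimized) can be inspected with the in-box mpclaim utility from an elevated command prompt; a sketch, where the MPIO disk number (0) is a hypothetical example:

C:\> mpclaim -s -d 0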
Live Volume support for Microsoft Windows/Hyper-V 9.4 Windows Server support limitations with Live Volume ALUA Live Volume ALUA support with SCOS 7.3 and DSM 2018 R1 has use case limitations to be aware of. In the process of developing this feature, it was found that the Microsoft implementation of MPIO incorrectly utilizes non-optimal paths when a stretch cluster node is configured to use non-uniform server mappings to a secondary Live Volume.
Live Volume support for Microsoft Windows/Hyper-V 9.5 Failover Only With this MPIO policy, a primary data path is configured as active/optimized, and all other paths are set to standby. This policy is an option when using uniform server mappings and secondary data paths to the secondary Live Volume have limited bandwidth or higher latency that would negatively affect the workload, and the Windows Server operating system does not fully support Live Volume ALUA.
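Load-balance policies can typically be applied with mpclaim as well; a sketch that sets Failover Only for all MPIO-claimed disks (in mpclaim's numbering, 1 = Failover Only, 2 = Round Robin, 3 = Round Robin with Subset — verify against the version in use):

C:\> mpclaim -L -M 1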
Live Volume support for Microsoft Windows/Hyper-V 9.6 Uniform server mappings with Live Volume and Round Robin In Figure 65, a Round Robin Live Volume MPIO configuration is depicted. In this scenario, the server has two adapters and is mapped to both SC Series arrays that host the primary and secondary Live Volume pair. This configuration is referred to as uniform server mapping since the server is mapped to both SC Series arrays.
Live Volume support for Microsoft Windows/Hyper-V 9.7 Hyper-V and Live Volume The full Live Volume feature set works well with Microsoft Hyper-V in both clustered and non-clustered environments. These Live Volume features include support for synchronous and asynchronous replication, load balancing, disaster avoidance, managed replication, pre-defined DR plans, and other features covered in this document.
Live Volume support for Microsoft Windows/Hyper-V If Live Volume automatic role swap is enabled, the VM placement and workload determines which array will own the primary Live Volume, since the Live Volume will follow the workload, based on the auto-swap thresholds. The scenario in Figure 66 depicts a virtual machine that is live migrated from Host A to Host B.
Live Volume support for Microsoft Windows/Hyper-V One of the best practices with CSVs is utilizing the ability to control which node is the CSV owner. Set the CSV to be owned by a cluster node that is in the primary site and mapped directly to the SC Series array. In this way, if the CSV goes into Network Redirected mode, the CSV owner is in the same site and downtime can be eliminated or reduced. Figure 67 depicts a multi-site Hyper-V cluster with Live Volume.
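CSV ownership can be moved to a node in the primary site with the FailoverClusters PowerShell module; a minimal sketch, where the CSV and node names are hypothetical:

# Move ownership of a CSV to a cluster node local to the primary Live Volume
PS C:\> Move-ClusterSharedVolume -Name "Cluster Disk 1" -Node HVNode1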
9.10.1 SC Series requirements for LV-AFO for Microsoft The SC Series storage requirements to support LV-AFO for Microsoft are as follows:
• A pair of SC Series arrays that support Live Volume (SCv2000 Series is not supported)
  - Replication and Live Volume feature licenses applied to both SC Series arrays
• SCOS 7.
Live Volume support for Microsoft Windows/Hyper-V Note: Each customer needs to make an informed choice about how they configure the quorum witness for a clustered environment that utilizes LV-AFO. The risk of an outage due to a temporary loss of quorum due to a less resilient design might be permissible in some cases, such as for test or development environments.
Live Volume support for Microsoft Windows/Hyper-V 9.10.3 Enabling LV-AFO for a Live Volume To enable LV-AFO, edit the settings for a Live Volume. Under Replication Attributes, make sure the Type is set to Synchronous and Sync Mode is set to High Availability. Once the correct sync mode is set, the Failover Automatically and Restore Automatically options become available at the bottom of the configuration screen.
Live Volume support for Microsoft Windows/Hyper-V 9.10.4 DR failure scenarios Refer to section 8 for a list of DR events and how LV-AFO will function to protect the environment. The DR events that will cause a Live Volume to automatically fail over are platform agnostic; LV-AFO itself works the same given VMware or Windows Server/Hyper-V hosts or clusters.
Live Volume support for Microsoft Windows/Hyper-V 9.11 Live Volume with SQL Server While SQL Server database files can be stored on Live Volumes, it is important to understand how a primary volume failure can impact database availability. If a database cannot be recovered after a primary volume failure, from either a frozen snapshot or the active snapshot on the secondary volume, a restore from backup will be required.
Live Volume with Linux/UNIX 10 Live Volume with Linux/UNIX Synchronous replication is a feature of SC Series that allows two copies of data to be maintained on separate SC Series arrays using one or multiple replication relationships. These two arrays can be in the same location, or can be geographically dispersed and connected by a WAN.
Live Volume with Linux/UNIX An LVMR volume (in asynchronous mode with active snapshot/Replay enabled, or in synchronous mode) can be used to manage and provide an offsite copy of business-critical data for disaster recovery or data distribution use cases (where locating data closer to the audience could significantly reduce data access latency). Figure 69 depicts this scenario. Live Volume with managed replication 10.
10.4.1 DM-Multipath configuration DM-Multipath comes with built-in default settings for SC Series arrays. These default settings are not reflected in the /etc/multipath.conf file. To display these settings, execute multipathd -k"show config" and look for the device section for "COMPELNT". It is recommended to examine these settings closely, especially when Live Volumes are involved, because the built-in defaults might not be appropriate; an illustrative override is sketched after the considerations below.
• While device path_selector policies service-time and round-robin are supported by DM-Multipath, Dell Technologies has performed extensive testing with round-robin policy only. See Figure 65.
• In a uniform configuration where primary and secondary paths are mapped to the Linux host, these paths are grouped into a single group and the round-robin policy spreads I/O to these paths with equal weight even though the secondary paths might have higher latency.
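As one illustration only (not the built-in defaults, which should always be confirmed with multipathd -k"show config"), a device stanza overriding settings for SC Series volumes in /etc/multipath.conf might look like the following; the values shown are assumptions to adapt for the environment, not recommendations:

devices {
    device {
        # SC Series volumes report this vendor/product pair
        vendor "COMPELNT"
        product "Compellent Vol"
        # Example overrides; tune per the considerations above
        path_grouping_policy multibus
        path_selector "round-robin 0"
        no_path_retry queue
    }
}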
Live Volume with Linux/UNIX 10.5.2 Existing Live Volumes created before SCOS 7.3 For existing Live Volumes created before SCOS 7.3, ensure that these volumes have an ALUA Optimized status of No in DSM and are excluded from being upgraded to ALUA-optimized. After SCOS is upgraded to 7.3, DSM displays a banner in the Live Volume tab indicating that there are existing Live Volumes that can be optimized for ALUA. Select any Live Volume on the list to see the existing ALUA Optimized status. See Figure 70.
Unselect Linux Live Volumes from the update list In the event that Linux Live Volumes created before SCOS 7.3 are optimized, the Linux hosts might experience a temporary I/O interruption during the optimization process. The length of the interruption varies depending on the host's DM-Multipath configuration. Internal testing showed this can range from a few seconds to a couple of minutes. 10.5.3 Disable reporting ALUA optimization As discussed in section 10.
Live Volume with Linux/UNIX Disable ALUA status reporting 10.5.4 Creating new Live Volumes on SCOS 7.3 All new Live Volumes created on SCOS 7.3 are ALUA-optimized. If the Live Volumes are intended for the Linux platform, ensure the Report Non-optimized Paths setting is unchecked during the creation of the Live Volume, or follow section 10.5.3 to update the setting after the Live Volume is created.
10.6 Identify parent SC Series arrays for Linux storage paths This section provides information on how to correlate the Linux device paths to the SC Series arrays. The following exercise demonstrates the process of showing the Fibre Channel ports on the SC Series arrays associated with each Linux device path. The same process also applies to iSCSI transport. The information might be helpful for troubleshooting path failures or performance-related issues.
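On the Linux host side, the path inventory and the target port WWPNs behind each path can be gathered with standard tools before cross-referencing them in DSM; a minimal sketch, assuming a multipath device named mpatha and Fibre Channel transport:

# List the SCSI paths (host:channel:target:lun) behind the multipath device
multipath -ll mpatha

# Show the target port WWPN associated with each Fibre Channel remote port
for t in /sys/class/fc_transport/target*; do
    echo "$t -> $(cat $t/port_name)"
done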
c) In DSM, navigate to the Fibre Channel fault domain tabs. If the array is configured in Virtual Port mode, go to the Virtual Ports screen. In this example, the Virtual Ports information for both primary and secondary SC Series arrays is examined.
d) The following relationships can be established after analyzing the information from steps a, b, and c: Linux multipath device mpatha consists of eight storage paths that span SC 22 and SC 23. 10.7 Use cases
Live Volume with Linux/UNIX A volume (or multiple volumes) is first created and mapped to a Linux host. The volumes are scanned, identified and brought into multipath awareness as shown in the following example.
|- 1:0:12:1 sdm 8:192  active ready running
|- 0:0:5:1  sdn 8:208  active ready running
|- 0:0:8:1  sdp 8:240  active ready running
|- 1:0:4:1  sdt 65:48  active ready running
`- 1:0:8:1  sdv 65:80  active ready running
vol_00 (36000d31000fba6000000000000000013) dm-3 COMPELNT,Compellent Vol
size=10G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:0:1 sdc 8:32  active ready running
  |- 0:0:0:1 sdb 8:16  active ready running
  |- 0:0:2:1
10.7.2 Multi-site In this multi-site scenario, the Linux hosts and SC Series arrays are geographically dispersed within a metropolitan (or multi-state) region; sites may be connected with MAN or WAN technologies. Figure 74 depicts this scenario. It should be noted that this scenario can also be scaled down and applied to single-site deployments.
reconstruct or import the storage domain into the alternate data center object, and from that object, reconstruct the virtual machine workgroup. The following technical and performance considerations should be taken into account when exploring these use cases. /etc/multipath.conf: This keeps the contents of /etc/multipath.
In this final demonstration, the replication link is quickly changed from synchronous replication to asynchronous replication and write I/O requests are applied to the primary Live Volume. Note the reduction in time required to commit these writes compared to the previous use of synchronous replication.

[tssrv216:/root]# cd /etc; time tar cvf - . | (cd /vol_00; tar xvf -)
[snip]
real    0m18.266s
user    0m0.104s
sys     0m1.
10.7.3
Live Volume use cases 11 Live Volume use cases This section exhibits additional examples of how Live Volume can be used in a variety of environments. Live Volume is not limited to these use cases. 11.1 Zero-downtime SAN maintenance and data migration With Live Volume, maintenance activities can be performed without downtime on an SC Series array.
Live Volume use cases Applications remain continuously available on SC2 while SC1 undergoes maintenance 11.2 Storage migration for virtual machine migration As VMware, Hyper-V, or XenServer virtual machines are migrated from data center to data center, Live Volume can automatically migrate the related volumes to optimize performance and minimize I/O network overhead.
The requirements for this operation include the following:
11.
Live Volume use cases 11.4 On-demand load distribution In this use case, Live Volume transparently distributes the workload, balances storage utilization, or balances I/O traffic between two SC Series arrays. Configuration: SC Series arrays must be connected using high-bandwidth and low-latency connections, especially when synchronous replication is used with Live Volume. Operation: In an on-demand, operator-driven process, Live Volume can transparently move volumes from one SC Series array to another.
Operation: In an on-demand, operator-driven process, Live Volume can transparently move volumes from one SC Series array to another. The applications operate continuously. This enables several options for improved system operation:
• Distribution of I/O workload
• Distribution of storage
• Distribution of front-end load traffic
• Reallocation of workload to match capabilities of heterogeneous systems
Cloud computing
11.
A Technical support and additional resources Dell.com/support is focused on meeting customer needs with proven services and support. Storage solutions technical documents and videos provide expertise that helps to ensure customer success on Dell EMC storage platforms. A.