PS Series Asynchronous Replication Best Practices and Sizing Guide
Dell EMC Engineering
November 2016
A Dell EMC Best Practices Guide
Revisions

Date            Description
August 2013     Initial release
November 2016   Added updates for delegated space in multiple pools as well as a sizing example

Acknowledgements

Updated by: Chuck Farah

The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.
Table of contents
1 Introduction
1.1 Audience
1.2 Key benefits of using PS Series asynchronous replication
2 Asynchronous replication overview
7.5.1 Primary group space considerations
7.5.2 Secondary group considerations
7.5.3 Initial replication and subsequent replication cycles
A Technical support and resources
1 Introduction This white paper describes the Dell EMC PS Series asynchronous replication feature and presents lab-validated best practices to help IT and SAN administrators understand and fully utilize the powerful set of volume data replication features delivered with every PS Series storage array.
Asynchronous replication addresses varying needs through powerful features and configuration flexibility:
- Multiple recovery points are stored efficiently.
- Per-volume replication schedules permit varying service levels.
- Failback to the primary site is fast because only the data that changed while the secondary site was in use needs to be synchronized.
- One-way, reciprocal, one-to-many, or many-to-one replication paths are possible.
- Thin replicas provide space efficiency.
2 Asynchronous replication overview PS Series asynchronous replication is used to replicate volumes between different groups as a way to protect against data loss. The two groups must be connected through a TCP/IP-based network. This means that the physical distance between the groups is not limited. The replication partner group can be located in the same data center, or it can be in a remote location.
Secondary group: A group containing the replica or copy of the source volume(s).
Destination group: Same as secondary group.
Delegated space: The amount of space on the secondary group that is delegated to a replication partner, to be reserved for retaining replicas.
Replica reserve: The space allocated from delegated space in the secondary group to store the volume replica set for a specific volume.
2.3 Local reserve, delegated space and replica reserve Local reserve is a volume-level setting that defines how much space is allocated on the primary group to support replication processing. Delegated space is a group-level setting that defines the total space dedicated to receiving and storing inbound replica sets on the secondary group from a primary group replication partner.
2.4 Replication partnerships With the exception of failback events, PS Series replication is always a one-way process, in which data moves from the primary group volume to the secondary group replica set. Replication also allows for reciprocal partners, which means that you can use two operational sites as recovery sites for each other. Figure 2 shows examples of different replication partnership scenarios.
A volume’s replica set is identified by the volume name with a numbered extension. The number corresponds to the number of replication partners, in the order they were added. For example, an inbound volume from the first replication partner, with the name repl-vol1, will have a replica set with the name repl-vol1.1 as shown in Figure 3. A volume from a second replication partner, also with a source volume named repl-vol1, will have a replica set named repl-vol1.2.
3 PS Series replication process When a replica is created, the first replication process completes the transfer of all volume data. For subsequent replicas, only the data that changed between the start time of the previous replication cycle and the start time of the new replication cycle is transferred to the secondary group. Dedicated volume snapshots are created and deleted in the background as necessary to facilitate the replication process.
3.3 Between replication events (repeating) Once first replication has occurred, the system continues to keep track of volume data changes that occur so that subsequent replication processes can copy those changes to the replica set. This tracking process does not consume additional space.
With replication, it does not matter whether the volume is thin provisioned or a traditional volume. In either case, only the data that has changed will be copied to the replica. On the secondary side, volumes are always thin provisioned to conserve the capacity used by the replica reserve for that volume.
4 Test topology and architecture To properly design a replication scenario, administrators must understand how the quality of the network connection between groups can affect replication. Also, as discussed previously, when data changes on the source volume, it is replicated over the network link. The amount of change occurring directly affects how long it takes for each replica to complete. To help illustrate these points, asynchronous replication was set up in a lab and test results were gathered.
5 Test methodology Because actual WAN links can vary greatly, the Netropy 10G WAN emulator was used to simulate the WAN connection. The Netropy 10G uses dedicated packet processors to ensure precision and repeatability. Besides throttling bandwidth, it can also inject latency, jitter, and packet loss. This allowed simulating the behavior of several different kinds of SAN and WAN links between sites.
6 Test results and analysis This section details the test results and analyzes the data to explain the effects of each parameter on PS Series asynchronous replication. 6.1 Effect of RAID level for the primary and secondary groups This test measured the time it took to replicate a 100 GB volume between primary and secondary sites using the full 10 Gb bandwidth. Then, the RAID level was changed on both primary and secondary groups to compare the effect.
At full 10 Gb speed, three times the amount of data is replicated (300 GB compared to 100 GB), but the three-volume configuration takes only about twice as long. As the available bandwidth across the WAN link decreases, the time to replicate all three volumes increases until it is about three times as long. This is because the WAN connection becomes the limiting factor on the amount of data the network can carry.
In such a case, the volume behaves as if it were a standard volume, and the contents of the entire volume are copied during the initial replication. 6.4 Theoretical bandwidth of links and replication time Because the TCP/IP protocol carries some overhead, the speed of the network alone provides insufficient information to estimate the time it will take for replication. In a typical SAN environment, Dell EMC recommends using jumbo frames (9,000 bytes) to get maximum performance.
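As a rough illustration (not a formula from the PS Series documentation), the following Python sketch estimates a best-case transfer time from the link speed and the amount of data, with an assumed 10 percent allowance for TCP/IP and iSCSI framing overhead; the overhead fraction is a placeholder and should be adjusted to match the observed environment.

    def estimate_replication_minutes(data_gb, link_mbps, protocol_overhead=0.10):
        # Best-case transfer time: data volume divided by the bandwidth that
        # remains after subtracting an assumed protocol-overhead fraction.
        effective_mbps = link_mbps * (1.0 - protocol_overhead)
        data_megabits = data_gb * 1000 * 8  # decimal GB to megabits
        return (data_megabits / effective_mbps) / 60.0

    # Example: 100 GB across an OC3 (155 Mbps) link
    print(round(estimate_replication_minutes(100, 155), 1), "minutes")  # ~95.6

For comparison, the measured OC3 result in section 6.5 was roughly 95 minutes for 100 GB, so a simple estimate of this kind is a reasonable starting point when the link itself is the only bottleneck.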
6.5 Bandwidth effects Asynchronous replication is affected by the speed of the link between replication partners. When replicating within the same data center or between buildings, the available bandwidth may be equal to the full speed of the SAN. In other cases, a slower WAN link may be utilized. Figure 8 shows the effect that network link speed can have on the time it takes to complete replication.
Time to complete a 100 GB replication at different link speeds (Figure 8, minutes on a log scale): measured times ranged from about 8 minutes to 9,480 minutes depending on link speed, with intermediate results of 16.23, 94.98, and 325.07 minutes.
6.6 Packet loss effects Asynchronous replication is also affected by the quality of the link between replication partners. If a link is dropping packets, the asynchronous replication processes will have to resend those segments. When packets are unacknowledged (lost), the TCP/IP protocol invokes an algorithm known as slow start (see RFC 5681, TCP Congestion Control). Slow start is part of a normal congestion control strategy to avoid sending more data than the network or other devices are capable of handling.
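To get a feel for how sensitive a TCP stream is to loss, the widely cited Mathis approximation (throughput is at most MSS divided by RTT times the square root of the loss rate) can be used; this is a general TCP rule of thumb, not a PS Series-specific formula, and the example numbers below are placeholders.

    import math

    def tcp_loss_ceiling_mbps(mss_bytes, rtt_ms, loss_rate):
        # Mathis et al. approximation of steady-state, single-stream TCP
        # throughput under random packet loss: rate <= MSS / (RTT * sqrt(p)).
        bytes_per_sec = mss_bytes / ((rtt_ms / 1000.0) * math.sqrt(loss_rate))
        return bytes_per_sec * 8 / 1_000_000

    # Example: 1460-byte segments, 20 ms round trip, 0.1 percent packet loss
    print(round(tcp_loss_ceiling_mbps(1460, 20, 0.001), 1), "Mbps")  # ~18.5

Even a fraction of a percent of packet loss can cap a single stream well below the nominal link speed, which is why cleaning up a lossy WAN link often does more for replication times than adding bandwidth.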
6.7 Latency and TCP window size effects For the purpose of this discussion, latency is how long it takes a packet of data to travel from one point to another across the network. Latency is inherent in any network, including an iSCSI-based SAN. Typically, iSCSI SAN latencies are quite small, and usually measured in microseconds or milliseconds.
Effects of latency at OC3 (10 GB of data):
One-way latency (ms)    Replication time (minutes)
0                       11.6
20                      141.52
50                      346.13
Effect of latency on replication time
Figure 10 shows the time it took to replicate 10 GB of data across an OC3 (155 Mbps) WAN link for three different simulated link latencies. The results clearly show a significant impact on the performance of replication across the WAN link.
Effect of TCP window (Figure 11): for a given window size and latency, the replication times for the 10 GB test were nearly identical across the 10 Gb, 1 Gb, and OC3 links; roughly 343 to 346 minutes with a 72K window at 50 ms, 276 to 279 minutes with a 2 MB window at 50 ms, 138 to 142 minutes with a 72K window at 20 ms, and 112 to 115 minutes with a 2 MB window at 20 ms.
Effects of TCP window
Figure 11 shows the effect of increasing the size of the TCP window from 72K to 2MB when replicating across different WAN link speeds with 20 ms or 50 ms link latency.
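The window results follow from the bandwidth-delay product: a single TCP connection cannot keep more unacknowledged data in flight than its window allows, so its throughput is capped at roughly the window size divided by the round-trip time. A minimal sketch (treating the round trip as twice the one-way latency and assuming a single replication stream, which is a simplification):

    def window_limited_mbps(window_bytes, one_way_latency_ms):
        # Upper bound on single-stream TCP throughput: window / round-trip time,
        # with the round trip approximated as twice the one-way latency.
        rtt_s = 2 * one_way_latency_ms / 1000.0
        return (window_bytes / rtt_s) * 8 / 1_000_000

    for window_bytes, label in ((72 * 1024, "72K"), (2 * 1024 * 1024, "2MB")):
        for latency_ms in (20, 50):
            mbps = window_limited_mbps(window_bytes, latency_ms)
            print(f"{label} window, {latency_ms} ms one-way: {mbps:.1f} Mbps ceiling")

With a 72K window and 50 ms of one-way latency, the ceiling is only a few megabits per second regardless of the underlying link speed, which is why the 10 Gb, 1 Gb, and OC3 results converge; enlarging the window raises the ceiling and shortens replication accordingly.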
6.8 Pool configuration effects The configuration of groups and pools may also have an effect on how quickly replication occurs. For example, the number of array members in a pool affects how many array members a volume may be distributed across. Because of the scale-out architecture of PS Series arrays, a group that contains multiple members has the potential to move data faster than a group with only a single member.
6.9 Server I/O effects Asynchronous replication is a background process, and as such is designed to have little impact on the performance of servers (hosts) connected to the SAN. On the other hand, that means that server I/O can affect the performance of replication. If there is a heavier workload from the attached hosts, the arrays may devote fewer resources to replication, which could cause replication times to be longer than when there is a lighter workload from the attached hosts.
7 Best practices for planning and design 7.1 Recovery time objective (RTO) and recovery point objective (RPO) In simple terms, as it applies to replication and disaster recovery, RTO is how long a business can get by without a particular system or application in the event of a disaster. An RTO of 24 hours implies that after a disaster, the system or data needs to be online and available again within 24 hours.
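As a back-of-the-envelope planning check (the interval, cycle duration, and RPO below are placeholders, not values from this testing), a replication schedule only satisfies an RPO if the schedule interval plus the time a cycle needs to finish stays within the RPO window:

    def meets_rpo(interval_hours, cycle_duration_hours, rpo_hours):
        # In the worst case, the newest completed replica is one full schedule
        # interval old plus the time the in-flight cycle still needs to finish.
        return (interval_hours + cycle_duration_hours) <= rpo_hours

    # Example: hourly schedule, cycles complete in 15 minutes, 2-hour RPO target
    print(meets_rpo(1.0, 0.25, 2.0))  # True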
7.3 Tuning the WAN link When replicating across a WAN, the data packets will probably be traveling through a router. A router generally has a memory buffer that stores incoming packets so that they can be processed and forwarded. If the WAN link is congested (or too small) and it becomes a bottleneck, this can cause incoming packets to fill up the memory buffer on the router. Eventually the router may be forced to discard (drop) incoming packets until it frees up space in the memory buffer.
The system defaults are set conservatively so that replication will continue in all cases, even if the volume contents change completely from one replication to the next. When using thin provisioning, be aware that percentages are based on the internally allocated size of the volume, rather than on the size of the volume as reported to the server.
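A small worked example of that difference (the volume sizes here are hypothetical): with a 100 percent local reserve, a thin-provisioned volume reserves space against what has actually been written, while a traditional volume reserves against its full reported size.

    def local_reserve_gb(reserve_pct, reported_gb, in_use_gb, thin_provisioned):
        # The reserve percentage applies to the internally allocated size for a
        # thin-provisioned volume and to the full reported size for a traditional one.
        basis = in_use_gb if thin_provisioned else reported_gb
        return reserve_pct / 100.0 * basis

    # Hypothetical 500 GB volume with 200 GB written, 100 percent local reserve
    print(local_reserve_gb(100, 500, 200, thin_provisioned=True))   # 200.0 GB
    print(local_reserve_gb(100, 500, 200, thin_provisioned=False))  # 500.0 GB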
Figure 13 shows the remote replicas that are available for a volume as displayed in Group Manager from the primary group. In this case, there are five replicas of a volume on the remote storage system. The number of replicas retained on the remote partner system is determined by the size of the volumes being replicated, the amount of changed data that is replicated, the size of the replica reserve, and the amount of delegated space.
Navigation tip: Group Manager GUI (primary) > Volumes > select volume > Schedule (tab) Replication cycle showing the maximum number of replicas to keep Note: Each replica will contain only the changes between replication cycles. For this example, the PrimaryGroup will replicate two volumes to the SecondaryGroup. Both groups have two pools; this is intentional, to demonstrate the improvement in firmware v8 and higher that allows multiple destination pools from a primary group.
7.5.1 Primary group space considerations The PrimaryGroup is a group with two PS6210 members and two pools (in this example, only the volumes in the hybrid pool will be used). From a best practice perspective, sizing to the total reported size of the volumes helps ensure that the secondary group can accommodate the volumes’ potential growth. The particular needs of the environment may override this general rule; however, the overall growth potential should still be considered.
Navigation tip: SAN Headquarters (primary) > Capacity > Pools (menu select pool) Reported Size and In use capacities for the two volumes The local replication reserve should accommodate both the original usable space in the volume and the maximum change rate. In addition, the fast-failback snapshot is typically kept; it preserves the most recent complete replica so that, after the replica is promoted to a volume on the secondary and used during disaster recovery, only the changes made during that time need to be synchronized back.
In summary, the PrimaryGroup needs a replication local reserve for all volumes totaling 1.64 TB (1680 GB), which is 200 percent of the volumes’ reported size. The following details show the replication configuration for each volume.
As a demonstration, repl-vol1 will be configured for replication with the space reservation as indicated in the following screenshot. Both volumes will be configured this way.
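The arithmetic behind the 1680 GB total can be laid out directly. In the following sketch, the 420 GB per-volume reported size is an inference from the 1680 GB total at 200 percent (the guide does not state the per-volume size explicitly), and the 200 percent factors reflect the local reserve with a fast-failback snapshot on the primary and the replica reserve used for each volume on the secondary:

    # Per-volume reported size in GB; inferred from the 1680 GB total, not stated explicitly
    volumes_gb = {"repl-vol1": 420, "repl-vol2": 420}

    LOCAL_RESERVE_PCT = 200    # local reserve sized for changes plus the fast-failback snapshot
    REPLICA_RESERVE_PCT = 200  # replica reserve per volume within the secondary delegated space

    local_reserve = sum(size * LOCAL_RESERVE_PCT / 100 for size in volumes_gb.values())
    delegated = sum(size * REPLICA_RESERVE_PCT / 100 for size in volumes_gb.values())

    print(f"Local replication reserve on the primary: {local_reserve:.0f} GB")   # 1680 GB
    print(f"Delegated space on the secondary (all pools): {delegated:.0f} GB")   # 1680 GB

With one volume replicating to each destination pool, the delegated space splits evenly, matching the 840 GB per pool described in section 7.5.2.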
After allocating the replication reserve, the PrimaryGroup is left with 14.92 TB of free space in the hybrid pool where these volumes reside, which is sufficient for replicating the two volumes. Navigation tip: SAN Headquarters (primary) > Capacity > Pools (menu select pool) SAN Headquarters view: PrimaryGroup has 14.92 TB free after allocating the replication reserve of 1.64 TB
Calculating additional space that may be needed from scheduled replications should take into account the number of replicas that are to be retained. The following example shows several replications occurring on one volume after a schedule is defined. Navigation tip: Group Manager GUI (primary) > Volumes > (select volume) > Replication (tab) Replication history showing the amount of data transfer between scheduled replications.
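A rough way to fold retention into that calculation (the 10 percent per-cycle change rate below is a placeholder; substitute the change rates observed in the replication history):

    def replica_reserve_estimate_gb(in_use_gb, replicas_to_keep, change_pct_per_cycle):
        # Rough estimate: one full copy of the in-use data for the most recent
        # replica, plus the changed data retained for each additional replica.
        changed_per_cycle = in_use_gb * change_pct_per_cycle / 100.0
        return in_use_gb + (replicas_to_keep - 1) * changed_per_cycle

    # Example: 399 GB in use, keep five replicas, assume 10 percent change per cycle
    print(round(replica_reserve_estimate_gb(399, 5, 10), 1), "GB")  # 558.6 GB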
In addition, the amount of space borrowed during the replication may be observed in the Group Manager GUI. Navigation tip: Group Manager GUI (primary) > Volumes > (select volume) > Status (tab) Temporary borrowing of space from free space during replication.
Replication borrowing was introduced in firmware v8 and allows for the temporary use of available free space. For instance, a replication may need more space than the local reserve provides and can borrow it from one of the previous replicas or local snapshots. After the replication is complete, the borrowed space is freed if it is no longer needed. Navigation tip: Group Manager GUI (primary) > Volumes > (select volume) > Status (tab) Borrowed space is freed after the pending changes are replicated 7.5.2 Secondary group considerations
A PS Series array with at least 1.64 TB of usable space should be used as the secondary. The SecondaryGroup mentioned previously has plenty of usable capacity available. Navigation tip: Group Manager GUI (secondary) > Group > Storage Pools SecondaryGroup usable space by pool The SecondaryGroup delegated space will be split between the pools, with 840 GB in the East pool and 840 GB in the West pool. Total delegated space for the SecondaryGroup is 1.64 TB or 1680 GB (200 percent for each volume).
Summary of needed allocation for the secondary group:
- East pool delegated space: 841.01 GB
- West pool delegated space: 841.01 GB
- Total: 1682.02 GB or 1.64 TB
Navigation tip: Group Manager GUI (secondary) > Group > Storage Pools (select pool) > Status (tab) Side-by-side view of space allocation by East (default) and West pools 7.5.3 Initial replication and subsequent replication cycles As a simple demonstration, the repl-vol1 volume, with approximately 399 GB of data, is replicated to the SecondaryGroup first.
Navigation tip: Group Manager GUI (primary) > Replication > Volume Replication PrimaryGroup showing the used delegated space on the SecondaryGroup (outbound replication). To demonstrate subsequent replication cycles, after the first replication, an additional 10 GB was added to repl-vol1. Those changes will be replicated on the next replication schedule or with manual replication.
SAN Headquarters provides a good way to review the replication cycles and statistics. Navigation tip: SAN Headquarters (primary) > Outbound Replicas > Volumes (select volume) SAN Headquarters showing the duration, amount of data to transfer (transfer size), and the transfer rate To continue, repl-vol2 will be configured for replication, and will perform the initial replication to the SecondaryGroup.
Once the initial replication of repl-vol2 is complete, the total delegated space will be allocated. Monitoring the reserve and delegated space is important for capacity planning. As an additional safeguard, SAN Headquarters may be configured to send email alerts for replication issues.
Delegated space may also be monitored directly within SAN Headquarters; borrowed space is indicated there as well.
From the point of view of the secondary, the total replica reserve usage for inbound replicas may also help determine when to allocate more delegated space or acquire a new PS Series array. This is the case in the following example, in which 100 percent of the reserve is in use. Navigation tip: SAN Headquarters (secondary) > Capacity > Inbound Replicas Total replica reserve on the secondary showing 100% in use.
For practical purposes, PS Series replication will borrow space as needed and efficiently use the space available. For example, space borrowing may be needed when additional space is required for the total replica reserve during a replication cycle. This may be observed in the Group Manager GUI. Navigation tip: Group Manager GUI (primary) > Group > Borrowed Space > Remote Delegated Space (tab) Amount of remote delegated borrowed space.
From a best practice perspective, borrowed space should be temporary in nature. If replications consistently require borrowed space over time, the delegated space should be increased; the borrowed space is then freed after the next replication cycle. See section 3.5 for more details. Figure 36 shows the borrowed space returning to 0 MB after an additional 302 GB was added to each pool’s delegated space.
Navigation tip: SAN Headquarters (primary) > Capacity > Outbound Replicas > Volumes (select volume) Monitoring Replication with SAN Headquarters 7.7 Replicating large amounts of data Some of the testing used 100 GB volumes for replication. Some tests replicated multiple 100 GB volumes simultaneously, and others replicated only a single volume to show how this can affect overall replication times. A 100 GB volume is not a huge amount of data in today’s IT environments.
Another option, which in some cases may be quicker and easier, is to connect the secondary storage systems in the local (primary) data center, establish the replication partnership, and do the initial full synchronization over the full-speed, local iSCSI SAN network. After the initial replication is complete, shut down the secondary storage and move it to the secondary or remote site.
A Technical support and resources Dell.com/support is focused on meeting customer needs with proven services and support. Dell TechCenter is an online technical community where IT professionals have access to numerous resources for Dell software, hardware and services. Storage Solutions Technical Documents on Dell TechCenter provide expertise that helps to ensure customer success on Dell EMC Storage platforms.