Best Practices

Dell EMC SC Series: Disaster Recovery for Microsoft SQL Server Using VMware Site Recovery Manager

Abstract

This document identifies options available for providing an automated disaster recovery solution for virtualized Microsoft® SQL Server® workloads on Dell EMC™ SC Series storage.
Revisions

Date            Description
October 2013    Initial release
July 2016       DSM, Live Volume, technical review
July 2019       Miscellaneous improvements

Acknowledgements

Authors: Doug Bernhardt, Jason Boche

The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.
Table of contents

Revisions
Acknowledgements
Table of contents
4.2.5 Live Volume
4.2.6 Volume replication considerations
4.3 Configuring Dell Storage Manager
4.
Executive summary

Data center consolidation by way of x86 virtualization is a trend that has gained tremendous momentum and offers many benefits. One workload type that is generally considered a virtualization candidate is Microsoft® SQL Server®. Although the physical nature of Microsoft SQL Server is transformed once it is virtualized, the necessity for data protection, retention, and recovery remains.
1 VMware Site Recovery Manager overview

Site Recovery Manager (SRM) is a disaster recovery testing, execution, and planned migration product for VMware virtualized data centers. It leverages the power of storage replication and virtual machine mobility to provide automated disaster recovery testing and execution as well as planned migrations of virtual machines between active sites.
1.1.1 Active/DR site design

Traditionally, many disaster recovery plans begin with a single active site and a single DR site. The active site represents the production datacenter. The DR site represents compute, network, and storage capacity where a business could rebuild its IT infrastructure and resume operations. The infrastructure at the DR site remains generally unused until a DR plan is tested or executed.
1.1.2 Active/active site design

Site Recovery Manager also supports a similar design in which two sites exist, but both are actively providing applications and services that are in scope for a comprehensive DR plan. In this design, each site functions as an active site for production applications as well as a recovery location for the other active site.

Active/active site architecture with DSM available at both sites in the event of a disaster

1.
a one-hour RPO may be tied to a tier 1 SQL Server application database. This means that at most one hour of data may be lost; put another way, the executed disaster recovery plan will recover data to a point no more than one hour before the time of the disaster. RPO is improved by shortening the interval at which data is backed up or replicated to the disaster recovery site.

1.
2 Solution components

The solutions described in this document incorporate various components from SC Series, array-based replication, and VMware Site Recovery Manager. A combination of these components can be leveraged to provide a purpose-built solution meeting the data protection and disaster recovery requirements of the environment.

2.
automatically through replication at scheduled or continuous intervals. Replication is key to meeting aggressive RTO and RPO targets in a disaster recovery strategy and serves as the cornerstone for VMware SRM operations. Various methods of replication exist and will be discussed in further detail.

2.
3 Storage infrastructure

Storage is required by vSphere to maintain encapsulated virtual machines and the data each of the VMs contains. vSphere-certified storage is presented to a cluster of vSphere hosts and abstracted by vSphere in a few different ways in order to meet the needs of the VMs. Outside of the disaster recovery context, storage plays major roles in availability, performance, and capacity.
or application integration available due to lack of visibility to this disk by the hypervisor. The focus of this paper is traditional .vmdk and RDM virtual disk types for Microsoft SQL Server virtual machines.
4 Solution design

SRM can be configured to use either vSphere Replication (VR) or array-based replication. This paper covers only SRM with array-based replication, specifically SC Series replication. Before creating the recovery plan in SRM, select the type of replication, the method used to create snapshots, and the frequency at which snapshots are taken.
4.1.2 Using Replay Manager with vRDMs

The VMware Virtual Machines extension can be used to create application-consistent snapshots of SQL Server data stored on vRDMs. This backup set will back up all virtual disks (.vmdks) and vRDMs used by the selected virtual machines. If there are other virtual machines using any of the same datastores, it is recommended to include those virtual machines in the same backup set as well.

4.1.
instantiated hourly for that volume. The length of time the replication will take depends on the size of the snapshot, which is dictated by the rate of change on the volume, and the bandwidth between the source and target systems. In the given example, assuming the replication bandwidth is sufficient, a one-hour RPO is established for the applications and data on the volume.
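The relationship between rate of change, bandwidth, and replication time can be estimated with simple arithmetic. The following is a minimal sketch, not a sizing tool: the function names are hypothetical, and it assumes that "sufficient bandwidth" means each snapshot finishes replicating before the next one is taken. It ignores protocol overhead, compression, deduplication, and competing traffic.

```python
def replication_seconds(changed_gib: float, bandwidth_mbps: float) -> float:
    """Estimate the time to send one snapshot's changed data.

    changed_gib    -- data changed since the previous snapshot, in GiB
    bandwidth_mbps -- usable link bandwidth in megabits per second
    """
    changed_bits = changed_gib * 1024**3 * 8
    return changed_bits / (bandwidth_mbps * 1_000_000)


def replication_keeps_up(changed_gib: float,
                         bandwidth_mbps: float,
                         snapshot_interval_minutes: float) -> bool:
    """Assumption: bandwidth is 'sufficient' when each snapshot is fully
    replicated before the next snapshot is taken."""
    return (replication_seconds(changed_gib, bandwidth_mbps)
            <= snapshot_interval_minutes * 60)


if __name__ == "__main__":
    # Hypothetical workload: 10 GiB of change per hour over a 100 Mbps link.
    minutes = replication_seconds(10, 100) / 60
    print(f"Estimated replication time: {minutes:.1f} minutes")
    print("Keeps up with hourly snapshots:", replication_keeps_up(10, 100, 60))
```

If the estimate exceeds the snapshot interval, replication falls progressively further behind and the effective recovery point drifts beyond the intended one-hour target.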
Support for SRM version 6.1 stretched storage with Live Volume was added in DSM 2016 R1. Supported configurations are asynchronous replication or synchronous high availability replication with non-uniform storage mapping to hosts. For more information on use cases and integrating stretched storage with SRM, see the SRM Administration documentation available at VMware Documentation.
4.3 Configuring Dell Storage Manager

Dell Storage Manager is a required component for SRM. It must be up and available at the recovery site for SRM to carry out the automated failover workflow when the primary site goes down. In an active/active site configuration, a DSM Data Collector is required at both sites, with the primary Data Collector in one site and a remote Data Collector in the other. The primary or remote Data Collector can be at either site.
using the latest frozen snapshot. Consider the following before using the active snapshot with asynchronous replication:

• Writes are queued up to be replicated in write order. However, if replication gets behind, it can consolidate multiple writes to the same logical block address (LBA) so that only the latest version of the LBA is sent. This type of write consolidation can prevent the successful recovery of SQL Server databases.
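The write-consolidation concern can be illustrated with a small simulation. This is a simplified model, not a description of how SC Series replication is implemented: the function names are hypothetical, and the ordering of consolidated writes (by each LBA's first appearance) is an assumption made for illustration. The example assumes a database that writes a log record before the newer version of a data page, and shows that a consolidated stream can leave the target in a state the source volume never passed through.

```python
def volume_states(writes):
    """Return every state a volume passes through as writes are applied
    in order. Each state maps LBA -> data."""
    state, states = {}, [{}]
    for lba, data in writes:
        state[lba] = data
        states.append(dict(state))
    return states


def consolidate(writes):
    """Keep only the latest data for each LBA.

    Assumption: the consolidated stream is ordered by the first time each
    LBA appeared in the backlog (dicts preserve insertion order)."""
    latest = {}
    for lba, data in writes:
        latest[lba] = data
    return list(latest.items())


def states_never_on_source(writes):
    """Target states produced by the consolidated stream that the source
    volume never actually held."""
    source_history = volume_states(writes)
    target_history = volume_states(consolidate(writes))
    return [s for s in target_history if s not in source_history]


if __name__ == "__main__":
    # Hypothetical backlog: page written, log record written, page rewritten.
    backlog = [(100, "page v1"), (200, "log for v2"), (100, "page v2")]
    print(states_never_on_source(backlog))
```

In this model the target briefly holds the second version of the page without the log record that preceded it on the source, which is the kind of ordering violation that database recovery cannot tolerate.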
# Connect to vCenter
Connect-VIServer -Server $vCenterDnsName

# Get the virtual machine
$Vm = Get-VM -Name $VmName

# Get the latest Replay Manager snapshot (there should be only one)
$VmSnapshot = Get-Snapshot -VM $Vm `
    | Where-Object { $_.
}

# Reset the volume
Set-CMLDiskDevice -SerialNumber $DiskSerialNumber -ReadOnly:$False
Set-CMLDiskDevice -SerialNumber $DiskSerialNumber -ResetSnapshotInfo
Set-CMLDiskDevice -SerialNumber $DiskSerialNumber -Online

Once all of the database volumes have been cleaned up and brought online, start the SQL Server service as well as any other services used with SQL Server (such as the SQL Server Agent). For automated recovery, configure the recovery plan to power the virtual machine on.
4.5 Performing a disaster recovery test

SRM provides the ability to test the recovery process without performing an actual failover. Virtual machines can be brought online in an isolated environment at the disaster recovery site. Because the test environment is isolated, recovered virtual machines can have the same name or IP address as production virtual machines. The production environment is not impacted by the test.
If the primary site is still available, SRM will effectively perform a planned migration. SRM will power off the virtual machines at the primary site, create new crash-consistent snapshots, and recover the volumes from those snapshots once they have been replicated to the recovery site. Because no writes can occur after the virtual machines are powered off, this provides a failover with no data loss.
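The difference between a planned migration and an unplanned failover can be shown with a simplified timeline. This is a sketch under stated assumptions, not SRM behavior expressed in code: the function name and the minute-based timeline are hypothetical, and a write is counted as lost if it occurred after the last snapshot that reached the recovery site.

```python
def lost_writes(write_times, last_replicated_snapshot):
    """Return the writes (by timestamp, in minutes) that occurred after the
    most recent snapshot replicated to the recovery site."""
    return [t for t in write_times if t > last_replicated_snapshot]


if __name__ == "__main__":
    writes = [10, 50, 70, 90]  # hypothetical write timestamps in minutes

    # Unplanned failover at minute 95: the last hourly snapshot to reach
    # the recovery site was taken at minute 60.
    print("Unplanned:", lost_writes(writes, last_replicated_snapshot=60))

    # Planned migration: virtual machines are powered off at minute 95, a
    # final snapshot is taken afterward and replicated before recovery.
    print("Planned:  ", lost_writes(writes, last_replicated_snapshot=95))
```

In the planned case the final snapshot is taken only after the virtual machines stop writing, so every write is captured before the volumes are recovered.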
A Additional resources

A.1 Technical support and resources

Dell.com/support is focused on meeting customer needs with proven services and support.

Storage technical documents and videos provide expertise that helps to ensure customer success on Dell EMC storage platforms.

A.2 Referenced or recommended publications

Dell EMC publications:

• Dell EMC SC Series Arrays and Microsoft SQL Server: http://en.community.dell.