Dell PS Series Snapshots and Clones: Best Practices and Sizing Guidelines Dell Storage Engineering November 2019 Dell EMC Best Practices
Revisions
Date            Description
May 2012        Initial release
December 2016   Minor updates
November 2019   vVols branding update

The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.
Table of contents
1 Introduction
1.1 Audience
1.2 Terminology
7.3 Clones versus snapshots
7.4 Potential causes of unexpected snapshot growth
7.5 Monitoring snapshot reserve
8 Summary
Executive summary Today’s storage administrators face ever-evolving challenges in ensuring that the storage resources requested by their customers, and by their customers’ applications, are resilient, recoverable, and highly available when needed, while remaining aligned with the organization’s RTO/RPO (Recovery Time Objective/Recovery Point Objective) policy.
1 Introduction The lab-validated best practices in this paper will help IT and SAN administrators understand and fully utilize the powerful snapshot and clone features delivered with every PS Series array.
Note: For additional details and definitions see the Dell PS Series Configuration Guide.
2 PS Series storage With its unique peer storage architecture, PS Series arrays deliver high performance and availability in a flexible environment with low cost of ownership. PS Series storage solutions deliver the benefits of consolidated networked storage in a self-managing, iSCSI storage area network that is affordable and easy to use, regardless of scale.
3 Protecting data with snapshots and clones 3.1 Snapshots Snapshots are point-in-time copies that capture the contents of a volume at a specific moment and are often used to recover data lost by events such as human error, viruses, or database corruption. They can also be used for testing or to create a source for backup to tape or another disk. A snapshot is created without disrupting normal host access to the volume.
3.2 Clone volumes A clone volume is a full copy of an existing volume. The clone volume has the same reported size, contents, and thin-provision settings as the original volume. A clone volume can be created from a regular volume, a specific replica of a volume, or a specific snapshot of a volume. A thin clone can also be created from a template volume. Thin clones are sub-volumes of template volumes where the template is a read-only version of the volume at a specific point in time.
3.
3.4 Storage requirements and configuration limits In a PS Series group, snapshots can be used to protect data against human error, viruses, or database corruption. PS Series storage administrators can create a point-in-time copy or multiple copies of the base volume, which can be retained for data protection or made accessible to another host. Creating snapshots can be performed while the base volume remains online, therefore, users and applications are not disrupted.
The following table shows the supported configuration limits for arrays running the 9.0.x firmware release. Configuration limits for groups of PS4XXX class arrays are shown separately. Refer to the release notes of the current firmware for the most current limits and other important information. Supported configuration limits 1 3.
new data from cache to the physical disks. Reducing unnecessary workload on the disks reduces latency and results in more available resources for the system to meet an application’s requirement.
Snapshot changes
Snapshot reserve can never grow beyond 100% of the base volume for a single snapshot. However, if there are multiple snapshots, the reserve usage may be greater than the size of the base volume.
action is to delete the oldest snapshot, providing the necessary free space to create a new snapshot. However, the user can set a policy for recovering snapshot space with the following options:
• Delete oldest snapshot (default).
• Set the base volume and its snapshots offline.
If a snapshot has active iSCSI connections, they will be terminated before the snapshot is deleted or before the volume and its snapshots are set offline.
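The two space recovery policies can be illustrated with a small Python model. This is only a sketch under stated assumptions: the function name `recover_space`, the policy strings, and the sizes in GB are hypothetical and are not part of any PS Series interface. It shows the decision logic only, not how the array tracks reserve internally.

```python
from collections import deque

def recover_space(snapshots, reserve_gb, needed_gb, policy="delete-oldest"):
    """Toy model of snapshot space recovery (hypothetical names, not a PS Series API).

    snapshots: deque of (name, used_gb) tuples, oldest first.
    Returns (remaining snapshots, volume_online).
    """
    used = sum(size for _, size in snapshots)
    if used + needed_gb <= reserve_gb:
        return snapshots, True            # enough free reserve, nothing to do
    if policy == "offline":
        return snapshots, False           # volume and its snapshots go offline
    # Default policy: delete the oldest snapshots until the new data fits.
    remaining = deque(snapshots)
    while remaining and used + needed_gb > reserve_gb:
        _, size = remaining.popleft()
        used -= size
    return remaining, True

snaps = deque([("snap1", 4), ("snap2", 3), ("snap3", 2)])
kept, online = recover_space(snaps, reserve_gb=10, needed_gb=3)
print([name for name, _ in kept], online)   # ['snap2', 'snap3'] True
```

With a 10 GB reserve already holding 9 GB, a further 3 GB does not fit, so the model drops the oldest snapshot and keeps the volume online. Passing `policy="offline"` keeps all snapshots but reports the volume as offline.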
3.9 Snapshot schedules Using Dell Storage Manager, Dell EqualLogic Group Manager, or Auto-Snapshot Manager, an administrator can create a schedule for taking snapshots (or replicas) of a volume or a volume collection. The schedule can execute on an hourly or daily basis and the frequency of snapshots (or replicas) can be set to occur once or at regular intervals (from five minutes to 12 hours apart).
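To make the interval range concrete, the following Python sketch lists the run times a repeating schedule would produce inside a daily window. It is an illustration only: `snapshot_times` and the example dates are hypothetical, and the code does not call Group Manager or ASM. The only values taken from the text are the five-minute and 12-hour interval bounds.

```python
from datetime import datetime, timedelta

MIN_INTERVAL = timedelta(minutes=5)   # shortest repeat interval stated in the text
MAX_INTERVAL = timedelta(hours=12)    # longest repeat interval stated in the text

def snapshot_times(start, end, interval):
    """Return the run times of a repeating schedule between start and end (inclusive)."""
    if not MIN_INTERVAL <= interval <= MAX_INTERVAL:
        raise ValueError("interval must be between 5 minutes and 12 hours")
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += interval
    return times

runs = snapshot_times(
    datetime(2019, 11, 1, 8, 0),
    datetime(2019, 11, 1, 18, 0),
    timedelta(hours=2),
)
print(len(runs))   # 6 snapshots: 08:00, 10:00, ..., 18:00
```

Counting the runs per day this way is a quick check of how many snapshots a schedule will accumulate, which feeds directly into how much snapshot reserve the volume needs.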
ASM/ME provides both a graphical user interface and a command line interface for manual or scripted operations. ASM/ME also supports popular Microsoft applications such as SQL Server®, Exchange®, SharePoint®, and Hyper-V virtual environments. Besides single volume operations, ASM also supports collections. Snapshot, clone, and replica operations can also be scheduled using ASM.
3.11 Supported versions of the Windows operating system The tables below list the supported versions of the Windows operating system and identify the Host Integration Tools components that do not support specific operating system versions. Host Integration Tools do not support any evaluation version of Windows or any version not listed in the tables below. Windows Desktop Operating System Support Component Windows 7 SP1 Windows 8.0 and 8.
4 Backup and recovery operations 4.1 Restoring from a snapshot There are times when data in the base volume needs to be restored to a particular point in time. PS Series snapshots provide several quick restore options to maximize the availability of critical data.
• A full volume can be quickly restored by setting a snapshot online (the host server must be powered off or the volume must be set offline before doing this).
A thin clone volume that was created from a template volume is dependent on the associated template volume. Deleting a thin clone does not delete the template volume it is associated with. Before deleting a template volume, all thin clones that depend on it must first be deleted or converted to regular volumes. If an administrator wants to preserve one of the thin clones, a full clone volume (essentially a copy) can be created from that thin clone before it is deleted. 4.
5 Snapshot and clone testing Several tests were performed to evaluate the behavior of snapshots and clones. In particular, the utilization of snapshot reserve space was monitored under varying load conditions and data access patterns, such as sequential versus random access. The impact on the source volume when using snapshots for backup was also tested. The following sections explain the architecture, topology, and test scenarios, as well as how the tests were performed.
Both of the SAN attached hosts had the Dell EqualLogic Host Integration Tools (HIT) for Microsoft Edition loaded and utilized the Microsoft software iSCSI initiator and Multi-Path I/O (MPIO). The HIT Kit installed the EqualLogic Device Specific Module (DSM) for Windows which automatically created the optimal number of iSCSI connections for the volume at the default settings. 5.
6 Test results and analysis This section details the various test scenarios performed and explains the results of each test. 6.1 Effect of block size and random vs. sequential I/O pattern Data warehouse and business intelligence applications typically have a higher percentage of sequential I/O operations as well as a higher percentage of disk reads. Online transaction processing (OLTP) workloads are typically more random in nature and disk writes may be 30-50% of the total I/O operations.
another must be allocated to the snapshot volume (or reserve). However, when a random I/O pattern is used, additional pages are allocated to the reserve more quickly because I/Os are not always adjacent and may modify a different page each time. As subsequent I/Os increasingly land on pages that have already been allocated, the rate of consumption slows and ultimately levels off.
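The difference between the two access patterns can be reproduced with a toy page model in Python. This is a sketch, not a model of the array: the page count, the number of blocks per page, and the function name `pages_touched` are illustrative assumptions chosen only to show the shape of the curve.

```python
import random

def pages_touched(num_pages, blocks_per_page, writes, pattern, seed=0):
    """Count distinct pages modified after a snapshot (toy model, illustrative sizes).

    Returns a list whose i-th entry is the number of pages touched after i+1 writes.
    """
    rng = random.Random(seed)
    total_blocks = num_pages * blocks_per_page
    touched = set()
    history = []
    for i in range(writes):
        if pattern == "sequential":
            block = i % total_blocks          # adjacent blocks fill one page at a time
        else:
            block = rng.randrange(total_blocks)  # each write may land on any page
        touched.add(block // blocks_per_page)
        history.append(len(touched))
    return history

seq = pages_touched(100, 16, 400, "sequential")
rnd = pages_touched(100, 16, 400, "random")
print("sequential pages:", seq[-1])
print("random pages:", rnd[-1])
```

In this model 400 sequential writes touch only 25 pages, while the same number of random writes touches most of the 100 pages. The random curve rises steeply at first and then flattens as writes increasingly hit pages that were already counted, which mirrors the leveling off described in the text.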
Snapshot reserve utilization from progressive snapshots and increasing % volume access After the first snapshot, the snapshot reserve usage was about 1 GB (10% of the 10 GB base volume). Since only 10% of the volume was accessed, all the pages making up that 10% were affected, resulting in the same amount being consumed in snapshot reserve space. In other words, the data in 10% of the base volume was completely changed. After the second snapshot, about 3 GB was consumed in snapshot reserve space.
6.3 Backup from a snapshot and clone volume Users often use snapshots to create backups of their data. During this process there is often an application workload running on the source volumes. In this test, the performance impact on the source volume was analyzed while snapshots and clones were being backed up. During this time, the source volume was under load.
very low until the worker threads were increased to 32. In both cases, this was the effect of increasing the overall workload on the storage system.
Backup of clone volume with workload
6.4 Backup from multiple snapshot volumes
In the previous section, it was observed that the performance impact (latency) on the source volume was very low while performing a backup from a single snapshot volume.
IOmeter workload types

Workload type   # workers   I/O type     Read/Write mix   Block size   Volume     Volume access
Backup          1           Sequential   100/0            128K         Snapshot   100%
Application     8           Random       67/33            8K           Base       10%

The read/write mix indicates the ratio of read to write on the source volume when referring to the application workload, and on the snapshot volume when referring to the backup workload. Volume access is the amount of the base volume or snapshot volume the workload was allowed to access.
Using snapshot volumes as the source for backups may affect the performance of the source volumes. In this case, the latency increased with the application simulation workload and the number of volumes being backed up (i.e., running the backup simulation workload). IOPS continued to scale, indicating the I/O processing abilities of the storage system had not yet been reached.
7 Planning and design best practices 7.1 Use ASM for Windows and VSM for VMware To ensure that consistent snapshots and clones are created for Windows, ASM/ME must be used so that VSS is invoked before the snapshot is performed by the underlying hardware (the storage arrays). When a clone volume is created, ASM will call the VSS writer to freeze I/O and flush any unwritten data prior to creating the clone volume, resulting in an application-consistent clone.
Total Snapshot Reserve = (∆ snapshot 1) + (∆ snapshot 2) + (∆ snapshot 3) + … + (∆ snapshot n)
For example, if 1 GB of data changes between snapshots of a volume and the administrator wishes to retain 5 snapshots, then at least 5 GB of snapshot reserve is needed. Because the data change rate may fluctuate, allocate slightly more than 5 GB to the snapshot reserve to ensure that all five are retained. 7.
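The Total Snapshot Reserve sum and its 5 GB example can be expressed as a short Python calculation. This is a minimal sketch: the function name `snapshot_reserve_gb` is hypothetical, and the 10% headroom is an assumed illustrative value, since the text only says to allocate slightly more than the sum.

```python
def snapshot_reserve_gb(deltas_gb, headroom=0.10):
    """Sum the per-snapshot changes and add headroom.

    deltas_gb: data changed (GB) between consecutive retained snapshots.
    headroom:  fractional margin for change-rate fluctuation (assumed value).
    """
    total = sum(deltas_gb)
    return total * (1 + headroom)

# Five retained snapshots with 1 GB of change between each.
needed = snapshot_reserve_gb([1, 1, 1, 1, 1])
print(f"{needed:.1f} GB")   # 5.5 GB
```

Setting `headroom=0` returns the bare 5 GB minimum from the example. In practice the per-snapshot deltas would come from observed reserve usage rather than a fixed estimate.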
7.5 Monitoring snapshot reserve As mentioned in the section titled, “Snapshot reserve”, the default setting allocates snapshot reserve equal to 100% of the volume size (reserve). The default settings also include a warning threshold of 90%. Both of these defaults can be changed to affect new volumes at any time, either globally, or per volume. Existing volumes will not be affected.
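A threshold check of this kind can be sketched in Python as follows. The function name `reserve_status` and the sample usage figures are hypothetical, and the code assumes the warning triggers when usage reaches or exceeds the threshold. It does not query an array; the usage numbers would have to come from whatever monitoring tool is in use.

```python
def reserve_status(used_gb, reserve_gb, warn_pct=90):
    """Return (percent of reserve used, warning flag).

    warn_pct defaults to the 90% warning threshold mentioned in the text;
    treating "at or above" as a warning is an assumption of this sketch.
    """
    pct = 100.0 * used_gb / reserve_gb
    return pct, pct >= warn_pct

pct, warn = reserve_status(used_gb=9.2, reserve_gb=10)
print(f"{pct:.0f}% used, warning={warn}")   # 92% used, warning=True
```

Lowering `warn_pct` for volumes with a high or bursty change rate gives earlier notice before the space recovery policy starts deleting snapshots.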
8 Summary The snapshot and clone features delivered with every PS Series storage array provide a powerful set of data recovery capabilities. The information presented in this whitepaper can help administrators get the most from these features.
A Solution infrastructure hardware and software versions
PowerEdge R710
• Windows 2008 R2 SP1
• HIT kit 3.5.1
• BIOS 6.0.7
• LOM firmware 6.4.4
• Broadcom BCM5709C Driver 6.2.9.0
• iDRAC firmware 1.70
• Lifecycle Controller firmware 1.50.671
• Intel X520 Driver 2.5.52.2
• Intel X520 firmware 12.5.2
PS6010XV
• Controller firmware 5.1.2
• 450GB 15K SAS drive firmware ERHB
PS6010E
• Controller firmware 5.1.2
• 1TB 7.2K SATA drive firmware KD03
PowerConnect 8024F
• Switch firmware 4.1.0.
B Additional resources Dell.com/support is focused on meeting customer needs with proven services and support. Storage technical documents and videos provide expertise that helps to ensure customer success on Dell EMC storage platforms.