Specifications

NetApp Deduplication for FAS and V-Series Deployment and Implementation Guide
5 DEDUPLICATION AND VMWARE
VMware environments deduplicate extremely well. However, while working out the VMDK and data store
layouts, keep the following points in mind:
Operating system VMDKs deduplicate extremely well because the binary files, patches, and drivers are
highly redundant between virtual machines (VMs). Maximum savings can be achieved by keeping these
in the same volume.
Application binary VMDKs deduplicate to varying degrees. Duplicate applications deduplicate very well;
applications from the same vendor commonly have similar libraries installed and deduplicate somewhat
successfully; and applications written by different vendors don't deduplicate at all.
Application data sets when deduplicated have varying levels of space savings and performance impact
based on application and intended use. Careful consideration is needed, just as with nonvirtualized
environments, before deciding to keep the application data in a deduplicated volume.
Transient and temporary data such as VM swap files, pagefiles, and user and system temp directories
do not deduplicate well and potentially add significant performance pressure when deduplicated.
Therefore NetApp recommends keeping this data on a separate VMDK and volume that are not
deduplicated.
Data ONTAP 7.3.1 includes a performance enhancement referred to as warm cache extension for zero
blocks. This is particularly applicable to VM environments, where multiple blocks are set to zero as a
result of system initialization. These zero blocks are all recognized as duplicates and are deduplicated
very efficiently. The warm cache extension enhancement provides increased sequential read
performance for such environments, which contain very large numbers of deduplicated blocks.
Examples of sequential read applications that will benefit from this performance enhancement include
NDMP, SnapVault, some NFS-based applications, and dump. This performance enhancement is also
beneficial to the boot-up processes in VDI environments.
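The effect of zero blocks on deduplication can be illustrated with a minimal sketch. It assumes fixed 4K blocks and uses a SHA-256 digest as a stand-in fingerprint; the function name dedup_stats and the sample data are hypothetical, and the code is not a description of how Data ONTAP implements deduplication or the warm cache extension.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size of 4K for this illustration


def dedup_stats(data: bytes, block_size: int = BLOCK_SIZE):
    """Return (total_blocks, unique_blocks) for data split into fixed-size blocks."""
    total = 0
    seen = set()
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # SHA-256 is only a stand-in fingerprint for this sketch
        seen.add(hashlib.sha256(block).digest())
        total += 1
    return total, len(seen)


# Hypothetical freshly initialized virtual disk:
# 1,000 zeroed blocks followed by 24 blocks of distinct nonzero data
zeroed = bytes(BLOCK_SIZE) * 1000
distinct = b"".join(i.to_bytes(4, "big") * (BLOCK_SIZE // 4) for i in range(1, 25))

total, unique = dedup_stats(zeroed + distinct)
print(total, unique)  # 1024 25
```

In this example all 1,000 zeroed blocks collapse to a single unique block, so 1,024 logical blocks need only 25 stored blocks.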
Overall, space savings of about 30% can be expected. This is a conservative number,
and in some cases users have achieved savings of up to 80%. The major factor that affects this percentage
is the amount of application data. New installations typically deduplicate extremely well, because they do not
contain a significant amount of application data.
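The savings percentages quoted above can be related to capacity figures with a short calculation. The sketch below assumes savings are expressed as the fraction of logical (pre-deduplication) capacity that no longer has to be stored; the function name and the capacity figures are hypothetical.

```python
def space_savings_percent(logical_bytes: int, physical_bytes: int) -> float:
    """Percentage of logical capacity saved by deduplication.

    logical_bytes:  data size before deduplication (assumed known)
    physical_bytes: space actually consumed after deduplication
    """
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return 100.0 * (logical_bytes - physical_bytes) / logical_bytes


GIB = 2**30

# Hypothetical volume with 1,000 GiB of VM data
print(space_savings_percent(1000 * GIB, 700 * GIB))  # 30.0, the conservative case
print(space_savings_percent(1000 * GIB, 200 * GIB))  # 80.0, the best case cited above
```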
Important: In VMware, the need for proper partitioning and alignment of the VMDKs is extremely important
(not just for deduplication). VMware must be configured so that the VMDKs are aligned on WAFL 4K block
boundaries as part of a standard VMware implementation. To help prevent the negative performance impact
of LUN/VMDK misalignment, read TR-3428, “NetApp and VMware Best Practices Guide,” at
http://media.netapp.com/documents/tr-3428.pdf. Also note that the applications in which the performance is
heavily affected by deduplication (when these applications are run without VMware) are likely to suffer the
same performance impact from deduplication when they are run with VMware.
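The alignment requirement amounts to a simple arithmetic check: the starting byte offset of a partition must be a multiple of 4,096. The following sketch assumes 512-byte sectors; the function name and the starting sectors are illustrative and are not taken from TR-3428. Sector 63 is included because legacy MBR partitioning tools commonly started the first partition there.

```python
WAFL_BLOCK_SIZE = 4096  # 4K block boundary, as stated in this section
SECTOR_SIZE = 512       # assumption: 512-byte sectors


def is_aligned(start_sector: int,
               sector_size: int = SECTOR_SIZE,
               block_size: int = WAFL_BLOCK_SIZE) -> bool:
    """Return True if a partition starting at start_sector begins on a block boundary."""
    return (start_sector * sector_size) % block_size == 0


# Illustrative starting sectors
for sector in (63, 64, 128, 2048):
    offset = sector * SECTOR_SIZE
    state = "aligned" if is_aligned(sector) else "misaligned"
    print(f"start sector {sector}: byte offset {offset} is {state}")
```

A partition starting at sector 63 begins at byte offset 32,256, which is not a multiple of 4,096, so each guest block straddles two 4K blocks on the storage system. Starting at sector 64 (byte offset 32,768) avoids this.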
A deduplication and VMware solution on NFS is easy and straightforward. Combining deduplication and
VMware with LUNs requires a bit more work. For more information on this, see section 4.10, “Deduplication
and LUNs.”
The following subsections describe the different ways that VMware can be configured. For more information
about NetApp storage in a VMware environment, see TR-3428, NetApp and VMware Virtual Infrastructure 3
Storage Best Practices.
5.1 VMFS DATA STORE ON FIBRE CHANNEL OR ISCSI: SINGLE LUN
This is the default configuration, and it’s the way that a large number of VMware installations are done
today. Deduplication occurs across the numerous VMDKs.