
NetApp Deduplication for FAS and V-Series Deployment and Implementation Guide
- Deduplication can be enabled, run, and managed only from the primary location. However, the flexible
  volume at the secondary location inherits all the deduplication attributes and storage savings through
  SnapMirror.
- Shared blocks are transferred only once, so deduplication also reduces network bandwidth usage.
- The volume SnapMirror update schedule is not tied to the deduplication schedule.
- The maximum volume size limit that applies is the lower of the maximum volume size limits of the
  source and destination volumes.
When configuring volume SnapMirror and deduplication, it is important to consider the deduplication
schedule and the volume SnapMirror schedule. As a best practice, start volume SnapMirror transfers of a
deduplicated volume after deduplication has completed (that is, not in the middle of the deduplication
process). This is to avoid sending undeduplicated data and additional temporary metadata files over the
network. If the temporary metadata files in the source volume are locked in Snapshot copies, they also
consume extra space in the source and destination volumes.
Deduplicated volumes can increase the performance overhead of volume SnapMirror. This extra overhead
needs to be accounted for when sizing the storage solution. For more information, see the section
"Deduplication Performance."
The Impact of Moving Deduplication Metadata Files Outside the Volume
Starting with Data ONTAP 7.3, most of the deduplication metadata resides in the aggregate outside the
volume. Therefore it does not get captured in Snapshot copies, and volume SnapMirror does not replicate
this data. This provides additional network bandwidth savings. However, some temporary metadata files are
still kept inside the volume and are deleted when the deduplication operation completes. If Snapshot copies
are created during the deduplication operation, these temporary metadata files are locked in Snapshot
copies, so a volume SnapMirror update that is initiated during a deduplication process transfers these
temporary metadata files over the network. To prevent this extra data from being replicated, schedule the
volume SnapMirror updates to take place after the deduplication operation has finished running on the
source volume.
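The scheduling rule above can be sketched as a small planning helper. This is a minimal illustration in Python, not a NetApp tool: the function names, the 30-minute safety margin, and the idea of supplying an estimated deduplication run time are all assumptions made for the example. In practice, the run time would come from observing earlier deduplication runs on the source volume.

```python
from datetime import datetime, timedelta


def earliest_safe_update(dedup_start: datetime,
                         estimated_duration: timedelta,
                         margin: timedelta = timedelta(minutes=30)) -> datetime:
    """Return the earliest time a volume SnapMirror update should begin.

    estimated_duration and margin are planning estimates (assumptions),
    not values reported by the storage system.
    """
    return dedup_start + estimated_duration + margin


def overlaps_dedup(update_time: datetime,
                   dedup_start: datetime,
                   estimated_duration: timedelta) -> bool:
    """True if the update would begin while deduplication is still expected to run.

    An update that starts inside this window risks transferring
    undeduplicated data and temporary metadata files.
    """
    return dedup_start <= update_time < dedup_start + estimated_duration
```

For example, with a hypothetical deduplication run that starts at 23:00 and usually takes three hours, `earliest_safe_update` returns 02:30 the next day, and `overlaps_dedup` flags an update planned for 01:00 as falling inside the deduplication window.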
In case of a disaster at the primary location, you may need to break the volume SnapMirror relationship and
have the volume SnapMirror destination start serving data. In this case, there is no fingerprint database file
at the destination for the existing data on the destination volume. However, the existing data retains the
space savings from the deduplication operations performed earlier on the original volume SnapMirror
source. Also, the deduplication process continues for new data being written to the volume and creates the
fingerprint database for this new data. The deduplication process obtains space savings in the new data only
and doesn’t deduplicate between the new data and the old data. To run deduplication for all the data in the
volume (and thus obtain higher space savings), use the sis start -s command. This command builds
the fingerprint database for all the data in the volume. Depending on the size of the logical data in the
volume, this process may take a long time to complete.
Important: Before using the sis start -s command, make sure that both the volume and the aggregate
containing the volume have sufficient free space to accommodate the addition of the deduplication
metadata. For information about how much extra space to leave for the deduplication metadata, see the section
"Deduplication Metadata Overhead."
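The free-space check described above can be expressed as a simple pre-flight calculation. The following Python sketch is illustrative only: it assumes that the metadata overhead can be stated as a fraction of the logical data in the volume, and it deliberately takes both fractions as parameters rather than hard-coding them. The function name and parameters are hypothetical; take the actual overhead figures from the section on deduplication metadata overhead for your Data ONTAP release.

```python
def has_room_for_metadata(logical_data_bytes: int,
                          volume_free_bytes: int,
                          aggregate_free_bytes: int,
                          volume_overhead_fraction: float,
                          aggregate_overhead_fraction: float) -> bool:
    """Check free space in both the volume and its aggregate.

    Assumption: overhead is modeled as a fraction of the logical data
    in the volume. Both fractions must be supplied by the caller.
    """
    volume_needed = logical_data_bytes * volume_overhead_fraction
    aggregate_needed = logical_data_bytes * aggregate_overhead_fraction
    # Both locations must have enough room; failing either check means
    # the fingerprint database rebuild should not be started yet.
    return (volume_free_bytes >= volume_needed
            and aggregate_free_bytes >= aggregate_needed)
```

The check returns `False` if either the volume or the aggregate is short on space, which mirrors the requirement that both have sufficient free space before `sis start -s` is run.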
REPLICATING WITH QTREE SNAPMIRROR
When using qtree SnapMirror with deduplication, remember the following points:
- Deduplication can be enabled on the source system, the destination system, or both.
- Both the deduplication license and the SnapMirror license must be installed on the system where
  deduplication is required.
- Unlike volume SnapMirror, qtree SnapMirror provides no network bandwidth savings, because
  the source system sends undeduplicated data to the destination system even if deduplication is
  enabled on the source system.
- The deduplication schedule is not tied to qtree SnapMirror updates on either the source or the
  destination; it is set up independently of the qtree SnapMirror schedule. For example, on the
  destination, the deduplication process does not start automatically when a qtree SnapMirror
  transfer completes.
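The bandwidth difference between the two replication modes can be illustrated with a rough estimate. This Python sketch is a simplified model under stated assumptions: it considers a single full transfer, ignores metadata, Snapshot copies, and incremental changes, and treats the deduplication savings as a single fraction. The function name and the `mode` strings are hypothetical.

```python
def estimated_transfer_bytes(logical_bytes: int,
                             dedup_savings_fraction: float,
                             mode: str) -> int:
    """Rough estimate of data sent over the network for one full transfer.

    mode "volume": shared blocks are sent once, so the deduplicated size is sent.
    mode "qtree":  undeduplicated data is sent, so the full logical size is sent.
    """
    if not 0.0 <= dedup_savings_fraction < 1.0:
        raise ValueError("dedup_savings_fraction must be in [0, 1)")
    if mode == "volume":
        return round(logical_bytes * (1.0 - dedup_savings_fraction))
    if mode == "qtree":
        return logical_bytes
    raise ValueError("mode must be 'volume' or 'qtree'")
```

Under this model, a source with 25% deduplication savings sends about three quarters of its logical size with volume SnapMirror and the full logical size with qtree SnapMirror.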