Administrator Guide

About Data Reduction
The FluidFS cluster supports two types of data reduction:
Data deduplication – Uses algorithms to eliminate redundant data, leaving only one copy of the data to be stored. The FluidFS cluster
uses variable-size block level deduplication as opposed to file level deduplication or fixed-size block level deduplication.
Data compression – Uses algorithms to reduce the size of stored data.
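The difference between variable-size and fixed-size block level deduplication can be sketched in a few lines of Python. This is a minimal illustration only, not the FluidFS implementation: the window size, boundary mask, minimum chunk size, and all function names are assumptions chosen to keep the example small. Variable-size chunking places block boundaries where the content itself matches a pattern, so inserting a byte near the start of a file typically disturbs only the first chunk.

```python
import hashlib

# Illustrative parameters only; they are not FluidFS values.
WINDOW = 16           # bytes of trailing context that decide each boundary
BOUNDARY_MASK = 0x3F  # boundary when the low 6 bits of the window hash are zero
MIN_CHUNK = 32        # avoid degenerate, very small chunks


def variable_chunks(data: bytes):
    """Split data at content-defined boundaries (variable-size blocks)."""
    chunks, start = [], 0
    for i in range(len(data)):
        if i + 1 - start < MIN_CHUNK:
            continue
        # Hash the last WINDOW bytes; the result depends on content, not offset.
        window = data[i + 1 - WINDOW:i + 1]
        digest = hashlib.blake2b(window, digest_size=4).digest()
        if int.from_bytes(digest, "big") & BOUNDARY_MASK == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def fixed_chunks(data: bytes, size: int = 64):
    """Split data into blocks of a fixed size (the last block may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def unique_bytes(chunk_lists):
    """Bytes that must be stored when each distinct chunk is kept only once."""
    store = {}
    for chunks in chunk_lists:
        for chunk in chunks:
            store[hashlib.sha256(chunk).digest()] = len(chunk)
    return sum(store.values())
```

Comparing unique_bytes for a file and a copy of that file with one byte prepended shows the effect: with fixed_chunks every block of the copy shifts and almost nothing is shared, whereas with variable_chunks the boundaries usually realign after the first chunk and most chunks are stored once.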
When using data reduction, note the following limitations:
The minimum file size to be considered for data reduction processing is 65 KB.
Because quotas are based on logical rather than physical space consumption, data reduction does not affect quota calculations.
If you disable data reduction, data remains in its reduced state during subsequent read operations by default. You can enable
rehydrate-on-read when disabling data reduction, which causes a rehydration (the reversal of data reduction) of data on subsequent read
operations. You cannot rehydrate an entire NAS volume in the background, although you could accomplish this task by reading the
entire NAS volume.
Cross-volume deduplication is not supported at this time.
Data reduction does not support base clone and cloned volumes.
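Reading an entire NAS volume, as mentioned in the rehydrate-on-read limitation above, can be done from a client with a short script. The following is a sketch under stated assumptions: the volume is mounted on the client at a path such as /mnt/nas_volume (hypothetical), rehydrate-on-read was enabled when data reduction was disabled, and reads are served by the NAS rather than by a client-side cache. The function name and buffer size are illustrative.

```python
import os


def read_entire_tree(root: str, buffer_size: int = 1024 * 1024):
    """Read every regular file under root and discard the data.

    Returns (files_read, bytes_read, failures), where failures lists the
    paths that could not be opened or read.
    """
    files_read = 0
    bytes_read = 0
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    while True:
                        block = handle.read(buffer_size)
                        if not block:
                            break
                        bytes_read += len(block)
                files_read += 1
            except OSError:
                failures.append(path)
    return files_read, bytes_read, failures
```

A call such as read_entire_tree("/mnt/nas_volume") walks the mounted volume once. On a large volume this generates sustained read load, so it is best scheduled outside peak hours.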
Table 15. Data Reduction Enhancements in FluidFS v6.0 or later
FluidFS v6.0 or later | FluidFS v5.0 or earlier
Data reduction is enabled on a per-NAS-cluster basis. | Data reduction is enabled on a per-NAS-volume basis.
Data reduction supports deduplication of files that are created or reside on different domains. | Data reduction is applied per NAS controller, that is, the same chunks of data that are owned by the different NAS controllers are not considered duplicates.
The distributed dictionary service detects when it reaches almost full capacity and doubles in size (depending on available system storage). | The dictionary size is static and limits the amount of unique data referenced by the optimization engine.
Data Reduction Age-Based Policies and Archive Mode
By default, data reduction is applied only to files that have not been accessed or modified for 30 days to minimize the impact of data
reduction processing on performance. The number of days after which data reduction is applied to files is configurable using
Storage Manager.
The default number of days is set to 30. When using FluidFS v5 or earlier, you can change the default to as low as 5 days, and you can start
data reduction processing immediately (archive mode). Starting with FluidFS v6, there is no archive mode available. You can set the
Exclude Files Accessed in the Last and Exclude Files Modified in the Last defaults to 1 day instead of using archive mode.
For more information about enabling and disabling archive mode, see the Dell FluidFS FS8600 Appliance CLI Reference Guide.
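The age-based policy, combined with the 65 KB minimum file size, can be expressed as a simple eligibility check. This is a sketch, not FluidFS code: the function and parameter names are hypothetical, 1 KB is assumed to be 1024 bytes, and a file whose age is exactly equal to the threshold is assumed to qualify.

```python
from datetime import datetime, timedelta

# Assumption: the 65 KB minimum is interpreted as 65 * 1024 bytes.
MIN_FILE_SIZE_BYTES = 65 * 1024


def is_candidate(size_bytes: int,
                 last_accessed: datetime,
                 last_modified: datetime,
                 now: datetime,
                 exclude_accessed_days: int = 30,
                 exclude_modified_days: int = 30) -> bool:
    """Return True if a file would be considered for data reduction."""
    if size_bytes < MIN_FILE_SIZE_BYTES:
        return False  # too small to be processed
    if now - last_accessed < timedelta(days=exclude_accessed_days):
        return False  # accessed too recently
    if now - last_modified < timedelta(days=exclude_modified_days):
        return False  # modified too recently
    return True
```

Setting both thresholds to 1 in this sketch corresponds to the FluidFS v6 substitute for archive mode described above: only files touched within the last day are excluded.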
Data Reduction Considerations
Consider the following factors when enabling data reduction:
Data reduction processing has a 5-20% impact on the performance of read operations on reduced data. It does not have any impact on
write operations or read operations on normal data.
Storage Center data progression is impacted. After data reduction processing, the Storage Center migrates reduced data up to Tier 1
disks.
Internal traffic increases during data reduction processing.
Data is rehydrated for antivirus scanning.