Specifications

NetApp Deduplication for FAS and V-Series Deployment and Implementation Guide
On flexible volume dvol_2, deduplication is scheduled to run every day from Sunday to Friday at 11
p.m.
On flexible volume dvol_3, deduplication is set to the auto schedule. This means that deduplication is
triggered by the amount of new data written to the flexible volume, specifically when there are 20% new
fingerprints in the change log.
On flexible volume dvol_4, deduplication is scheduled to run at 6 a.m. on Saturday.
When the -s option is specified, the command sets up or modifies the schedule on the specified flexible
volume. The schedule parameter can be specified in one of four ways:
[day_list][@hour_list]
[hour_list][@day_list]
-
auto
The day_list specifies which days of the week deduplication should run. It is a comma-separated list of
the first three letters of the day: sun, mon, tue, wed, thu, fri, sat. The names are not case sensitive. Day
ranges such as mon-fri can also be used. The default day_list is sun-sat.
The hour_list specifies which hours of the day deduplication should run on each scheduled day. The
hour_list is a comma-separated list of the integers from 0 to 23. Hour ranges such as 8-17 are allowed.
Step values can be used in conjunction with ranges. For example, 0-23/2 means "every 2 hours." The
default hour_list is 0; that is, midnight on the morning of each scheduled day.
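The day_list and hour_list rules above can be illustrated with a minimal Python sketch. It is not the parser used by the storage system; the function names expand_hours and expand_days are hypothetical, and rejecting wrap-around ranges such as fri-mon is an assumption, because the text does not say how they are handled.

```python
# Illustrative sketch only: expands day_list / hour_list strings as described
# in the text. Function names are hypothetical, not part of any NetApp tool.
DAYS = ["sun", "mon", "tue", "wed", "thu", "fri", "sat"]


def expand_hours(hour_list):
    """Expand an hour_list such as '8-17', '0,12' or '0-23/2' into sorted hours."""
    hours = set()
    for item in hour_list.split(","):
        span, _, step = item.strip().partition("/")
        low, _, high = span.partition("-")
        start = int(low)
        end = int(high) if high else start
        stride = int(step) if step else 1
        if not (0 <= start <= end <= 23) or stride < 1:
            raise ValueError(f"invalid hour_list item: {item!r}")
        hours.update(range(start, end + 1, stride))
    return sorted(hours)


def expand_days(day_list):
    """Expand a day_list such as 'sat' or 'Mon-Fri' into ordered day names."""
    days = set()
    for item in day_list.lower().split(","):  # day names are not case sensitive
        low, _, high = item.strip().partition("-")
        start = DAYS.index(low)  # raises ValueError for an unknown day name
        end = DAYS.index(high) if high else start
        if start > end:
            # Assumption: wrap-around ranges are rejected in this sketch.
            raise ValueError(f"invalid day_list item: {item!r}")
        days.update(range(start, end + 1))
    return [DAYS[i] for i in sorted(days)]


print(expand_days("Mon-Fri"))  # ['mon', 'tue', 'wed', 'thu', 'fri']
print(expand_hours("0-23/2"))  # every 2 hours, starting at hour 0
```

With the defaults from the text, expand_days("sun-sat") yields all seven days and expand_hours("0") yields only midnight.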
If "-" is specified, there is no scheduled deduplication operation on the flexible volume.
The auto schedule causes deduplication to run on that flexible volume whenever there are 20% new
fingerprints in the change log. This check is done in a background process and occurs every hour.
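As a rough illustration of the hourly check, the sketch below compares the number of new fingerprints in the change log against a 20% threshold. The text does not define what the 20% is measured against, so the denominator used here (the count of fingerprints already recorded for the volume) is an assumption, as are the function and parameter names.

```python
# Illustrative sketch only. The 20% threshold comes from the text; the choice
# of denominator and all names below are assumptions made for this example.
AUTO_THRESHOLD_PERCENT = 20


def auto_should_run(new_fingerprints, existing_fingerprints):
    """Return True when new change-log fingerprints reach the 20% threshold."""
    if new_fingerprints <= 0:
        return False  # nothing new in the change log, nothing to do
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return new_fingerprints * 100 >= AUTO_THRESHOLD_PERCENT * existing_fingerprints


# A background task would evaluate this once per hour for each volume
# that is configured with the auto schedule.
print(auto_should_run(200, 1000))
print(auto_should_run(199, 1000))
```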
When deduplication is enabled on a flexible volume for the first time, an initial schedule is assigned to the
flexible volume. This initial schedule is sun-sat@0, which means "once every day at midnight."
To configure the schedules shown earlier in this section, the following commands would be issued:
toaster> sis config -s - /vol/dvol_1
toaster> sis config -s 23@sun-fri /vol/dvol_2
toaster> sis config -s auto /vol/dvol_3
toaster> sis config -s sat@6 /vol/dvol_4
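The four commands above each use a different form of the schedule parameter. The following Python sketch shows one way a script could tell the forms apart before issuing a command. The function name split_schedule is hypothetical, and two of its rules are assumptions rather than documented behavior: a part that starts with a digit is treated as the hour_list, and an empty schedule string is rejected.

```python
# Illustrative sketch only: classifies a schedule string into the four forms
# listed in the text. Not the logic used by the sis command itself.
DEFAULT_DAYS = "sun-sat"  # default day_list per the text
DEFAULT_HOURS = "0"       # default hour_list per the text


def split_schedule(schedule):
    """Return (kind, day_list, hour_list) for a schedule parameter."""
    text = schedule.strip().lower()
    if not text:
        raise ValueError("empty schedule")  # assumption: not accepted here
    if text == "-":
        return ("none", None, None)
    if text == "auto":
        return ("auto", None, None)
    found = {}
    for part in text.split("@"):
        if not part:
            continue
        # Assumption: a leading digit marks an hour_list, anything else a day_list.
        key = "hours" if part[0].isdigit() else "days"
        if key in found:
            raise ValueError(f"two {key} parts in schedule: {schedule!r}")
        found[key] = part
    return ("timed", found.get("days", DEFAULT_DAYS), found.get("hours", DEFAULT_HOURS))


for example in ("-", "23@sun-fri", "auto", "sat@6"):
    print(example, "->", split_schedule(example))
```

Both 23@sun-fri and sat@6 resolve to the same ("timed", day_list, hour_list) shape, which matches the text's statement that the day and hour parts may appear in either order.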
3 SIZING FOR PERFORMANCE AND SPACE EFFICIENCY
This section discusses the deduplication behavior that you can expect. Information in this section comes
from testing, observations, and knowledge of how deduplication functions.
3.1 DEDUPLICATION GENERAL BEST PRACTICES
This section contains deduplication best practices and lessons learned from internal tests and from
deployments in the field.
Deduplication consumes system resources and can alter the data layout on disk. Depending on the
application’s I/O pattern and on how deduplication changes the data layout, read and write I/O
performance can vary considerably. Both the space savings and the performance impact depend heavily
on the application and the data contents.
NetApp recommends that you carefully consider the performance impact of deduplication, measure it
in a test setup, and factor it into sizing before deploying deduplication in performance-sensitive
solutions. For more information on how deduplication affects specific applications, contact the
specialists at NetApp for advice and for test results of your particular application with
deduplication.
If only a small amount of new data is written, run deduplication infrequently. Running it frequently
in that case brings no added benefit and still consumes CPU resources. How often you run it depends on
the rate of change of the data in the flexible volume.