HP StoreOnce Backup System Concepts and Configuration Guidelines (BB877-90913, November 2013)

VTL and NAS Replication overview
Deduplication is the key enabling technology for efficient replication because, once seeding is
complete, only the new data created at the source site needs to be replicated to the target site.
Knowing precisely which data needs to be replicated can result in bandwidth savings in excess
of 95% compared to transmitting the full contents of a cartridge/share from the source site.
The actual bandwidth saving depends on the backup data change rate at the source site.
There is some overhead of control data that also needs to pass across the replication link. This is
known as manifest data. A final component is any hash codes that are not present on the remote
site, which may also need to be transferred. Typically the overhead components are less than 2%
of the total virtual cartridge/file size to replicate.
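As a rough illustration of the figures above, the following sketch estimates how much data crosses the link for one cartridge or share after seeding. The function name, its parameters, and the simple model (new data plus a fixed overhead fraction) are assumptions made for illustration, not a StoreOnce formula; the 2% overhead is used as an upper bound taken from the text.

```python
def estimate_replication_transfer(size_gb, change_rate, overhead_fraction=0.02):
    """Return (transfer_gb, saving_fraction) for one cartridge or share.

    size_gb           -- full size of the virtual cartridge/file
    change_rate       -- fraction of the data that is new since the last
                         replication (0.0 to 1.0); hypothetical input
    overhead_fraction -- manifest/hash-code overhead as a fraction of the
                         full size (assumed to be at most 2%)
    """
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    if not 0.0 <= change_rate <= 1.0:
        raise ValueError("change_rate must be between 0 and 1")

    new_data_gb = size_gb * change_rate        # unique data not yet on the target
    overhead_gb = size_gb * overhead_fraction  # control data sent regardless
    transfer_gb = new_data_gb + overhead_gb

    # Saving relative to sending the full cartridge/share contents.
    saving = 1.0 - transfer_gb / size_gb
    return transfer_gb, saving


# Example: an 800 GB cartridge with a 1% change rate.
transfer_gb, saving = estimate_replication_transfer(800, 0.01)
print(f"{transfer_gb:.1f} GB to transfer, {saving:.0%} bandwidth saving")
```

With a low change rate the fixed overhead dominates the transfer, which is why the achievable saving depends so strongly on how much the backup data changes between replications.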
Replication throughput can be “throttled” by using bandwidth limits as a percentage of an existing
link, so as not to affect the performance of other applications running on the same WAN link.
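To see what a bandwidth limit means in practice, the sketch below converts a percentage limit into usable throughput and an approximate replication time. The function and parameter names are hypothetical, decimal units are assumed (1 GB = 8000 megabits), and protocol overhead and link contention are ignored.

```python
def replication_hours(transfer_gb, link_mbps, limit_percent):
    """Approximate hours needed to replicate transfer_gb over a throttled link.

    transfer_gb   -- data to send, in gigabytes (decimal)
    link_mbps     -- nominal WAN link speed in megabits per second
    limit_percent -- replication bandwidth limit as a percentage of the link
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < limit_percent <= 100:
        raise ValueError("limit_percent must be in the range (0, 100]")

    usable_mbps = link_mbps * limit_percent / 100.0  # share left for replication
    megabits = transfer_gb * 8 * 1000                # GB -> megabits
    seconds = megabits / usable_mbps
    return seconds / 3600.0


# Example: 24 GB over a 100 Mbit/s link limited to 50%.
print(f"{replication_hours(24, 100, 50):.2f} hours")
```

Halving the limit doubles the replication time, so any limit should be checked against the time available for replication.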
Key performance factors with replication
Key factors for performance considerations with replication:
• Define your seeding (first replication) strategy before implementation. Several methods are
  available depending on your replication model: active/passive, active/active or Many-to-One.
  See Seeding methods in more detail.
• If a lot of similar data exists on remote office StoreOnce libraries, replicating these into a
  single target VTL library will give a better deduplication ratio on the target StoreOnce Backup
  system. Consolidation of remote sites into a single device at the target is available with VTL
  device types. (Catalyst targets can also be used to consolidate replication from various source
  sites into a single Catalyst store at a DR site.)
• Replication starts when the cartridge is unloaded or the NAS share file is closed, provided a
  replication window is enabled. If a backup spans multiple cartridges or NAS files, replication
  will start on the first cartridge/file as soon as the job spans to the second, unless a replication
  blackout window is in force.
• Size the WAN link appropriately to allow for replication and normal business traffic, taking
  into account data change rates. A temporary increase in WAN speed may be desirable for
  the initial seeding process if it is to be performed over the WAN.
• Apply replication bandwidth limits or replication blackout windows to prevent bandwidth
  hogging. The maximum number of concurrent replication jobs supported by source and target
  StoreOnce appliances can also be varied in the StoreOnce Management GUI to manage
  throughput and bandwidth utilization.
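The start conditions described in the list above (cartridge unloaded or NAS file closed, and no blackout window in force) can be sketched as a simple check. This is only an illustrative model: the function names and the representation of blackout windows as daily (start, end) time pairs are assumptions, not the appliance's actual scheduling logic.

```python
from datetime import time


def in_window(now, start, end):
    """True if `now` falls in [start, end), including windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def can_start_replication(media_closed, now, blackout_windows):
    """Decide whether replication of a cartridge/file may start.

    media_closed     -- True once the cartridge is unloaded or the NAS file is closed
    now              -- current time of day (datetime.time)
    blackout_windows -- list of hypothetical (start, end) daily blackout periods
    """
    if not media_closed:
        return False
    # Replication is held back while any blackout window is in force.
    return not any(in_window(now, start, end) for start, end in blackout_windows)


# Example: replication blocked during business hours, 08:00 to 18:00.
blackouts = [(time(8, 0), time(18, 0))]
print(can_start_replication(True, time(9, 30), blackouts))
print(can_start_replication(True, time(22, 0), blackouts))
```

In this model a job that closes its first cartridge during a blackout window simply waits; replication becomes eligible as soon as the window ends.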
Catalyst Copy and deduplication
Catalyst Copy is the equivalent of virtual library and NAS share replication. The same principles
apply in that only the new data created at the source site needs to be copied (replicated) to the
target site. The fundamental difference is that the copy jobs are created by the backup application
and can, therefore, be tracked and monitored within the backup application catalog as well as
from the StoreOnce Management GUI. Should it be necessary to restore from a Catalyst copy, the
backup application is able to restore from a duplicate copy without the need to re-import data to
the catalog database.
The key performance factors are the same as the replication performance factors.
Housekeeping
Housekeeping is an important process that maximizes the deduplication efficiency of the
appliance. If data is deleted from the StoreOnce system (e.g. a virtual cartridge is overwritten or
erased), any unused chunks will be marked for removal, so space can be freed up (space