Advanced Backup to Disk Performance White Paper

Basics of Backup Performance
Backup performance is always limited by one or more bottlenecks in the system, of which the
tape drive is only one part. The goal is to make the tape drive itself the bottleneck; only then can the
system achieve the performance figures advertised on the drive's specification sheet.
Note that backup jobs can push hardware resources to their limits in a way that never happens
under normal application load. This stresses the rest of the system and can expose failures,
because many components are involved and must hand data to each other under tight timing
constraints.
The flow of data throughout the system must be fast enough to provide the tape drive with data at its
desired rates. High-speed tape drives, such as the Ultrium 960, are so fast that this can be a difficult
goal to achieve. If the backup performance does not match the tape drive's data sheet, the
bottleneck is somewhere else in the system.
A single component, such as a 100BASE-T network, can reduce the performance of an SDLT or LTO
tape drive to a very low transfer rate (a very good use case for first staging the data on disk and
then backing it up to tape).
All components must be considered to estimate the theoretical backup performance; practical
performance data can only be obtained from benchmarks.
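The "slowest component wins" principle can be sketched as a back-of-the-envelope model. All of
the stage names and throughput figures below are illustrative assumptions (the 12.5 MB/s value is
simply the theoretical peak of a 100 Mbit/s link), not measured or quoted specifications:

```python
# End-to-end backup throughput is limited by the slowest stage in the
# data path. Figures are illustrative assumptions, not measurements.
stages_mb_per_s = {
    "disk array read": 200.0,
    "100BASE-T network": 12.5,    # 100 Mbit/s ~= 12.5 MB/s theoretical peak
    "backup server": 150.0,
    "tape drive (native)": 80.0,  # an LTO-3 class drive, assumed figure
}

bottleneck = min(stages_mb_per_s, key=stages_mb_per_s.get)
effective = stages_mb_per_s[bottleneck]
print(f"Bottleneck: {bottleneck} at {effective} MB/s")

# Time to back up 1 TB (1,000,000 MB) at the effective rate:
hours = 1_000_000 / effective / 3600
print(f"1 TB would take about {hours:.1f} hours")
```

At 12.5 MB/s the network, not the 80 MB/s drive, dictates the backup window, which is exactly the
situation where disk staging pays off.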
Factors that critically affect backup speed:
Multiplexing
Multiplexing allows better utilization of the tape drive's bandwidth during backup, but it can slow
down restore performance because the data streams are interleaved along the entire tape.
Restoring a single stream therefore takes longer, since the blocks of the other streams must also
be read (and potentially ignored).
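The restore penalty of multiplexing can be illustrated with a toy model. The round-robin
interleaving, stream names, and block counts below are made up for illustration and do not
reflect how any particular backup product actually multiplexes:

```python
# Toy model of multiplexed backup: blocks from concurrent client streams
# are interleaved onto the tape. Restoring one stream then forces the
# drive to read (and discard) the other streams' blocks as well.
from itertools import chain, zip_longest

streams = {
    "clientA": [f"A{i}" for i in range(4)],
    "clientB": [f"B{i}" for i in range(4)],
    "clientC": [f"C{i}" for i in range(4)],
}

# Round-robin interleave, a stand-in for real multiplexing.
tape = [b for b in chain.from_iterable(zip_longest(*streams.values())) if b]

# Single-stream restore: every block on the span must be read,
# but only clientA's blocks are useful.
read_blocks = len(tape)
useful_blocks = len([b for b in tape if b.startswith("A")])
print(f"read {read_blocks} blocks to restore {useful_blocks}")
```

With three interleaved streams, a single-stream restore reads three times the data it actually
delivers, which is why multiplexing trades restore speed for backup-time drive utilization.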
Disk and Tape Buffers
DP offers a set of advanced options for backup devices and disk agents. The default settings are
device-specific and suit most backup environments. The Ultrium 960 is an exception and requires
the modification described in the chapter Tuning Recommendations.
Data File Size
The more small files there are, the greater the overhead associated with backing them up. The
worst-case scenario for backup is a large number of small files, because of the system overhead
of file access.
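The effect can be approximated with a simple cost model: total backup time is the raw data
transfer plus a fixed per-file price for open/close and metadata access. The 5 ms per-file
overhead and 80 MB/s stream rate below are illustrative assumptions:

```python
# Rough model of file-count overhead: each file adds a fixed per-file
# metadata/access cost on top of the raw data transfer. Figures are
# illustrative assumptions, not measurements.
def backup_seconds(n_files, total_gb, per_file_ms=5.0, stream_mb_s=80.0):
    transfer = total_gb * 1024 / stream_mb_s   # pure data movement
    overhead = n_files * per_file_ms / 1000.0  # per-file access cost
    return transfer + overhead

# The same 100 GB payload with very different file counts:
few_big = backup_seconds(n_files=100, total_gb=100)
many_small = backup_seconds(n_files=1_000_000, total_gb=100)
print(f"100 large files:    {few_big / 60:.1f} min")
print(f"1M small files:     {many_small / 60:.1f} min")
```

Under these assumptions the payload is identical, yet the million-file case takes several times
longer, driven entirely by per-file overhead rather than data volume.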
Data compressibility
Incompressible data backs up more slowly than highly compressible data. JPEG files, for example,
are not very compressible, whereas database files can be highly compressible. The accepted
standard for quoting tape backup specifications is based on an assumed 2:1 compression ratio.
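How strongly compressibility varies by data type can be demonstrated with a quick software
experiment. zlib is used here only as a rough proxy; the drive's on-board hardware compression
uses a different algorithm, but the trend is the same:

```python
# Compare compressibility of redundant, database-like data against
# random data (a stand-in for already-compressed formats such as JPEG).
import os
import zlib

repetitive = b"customer_record;" * 4096      # highly redundant sample
random_like = os.urandom(len(repetitive))    # incompressible sample

ratios = {}
for name, data in (("repetitive", repetitive), ("random-like", random_like)):
    ratios[name] = len(data) / len(zlib.compress(data))
    print(f"{name}: {ratios[name]:.1f}:1")
```

The redundant sample compresses far beyond 2:1, while the random sample does not shrink at
all, which is why the same drive delivers very different effective throughput on different data sets.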
Disk Array Performance
It is often overlooked that you cannot put data on tape any faster than you can read it from disk.
Backup is more sequential in nature than random (from a disk array access perspective). Disk array
performance depends on the number of disks, RAID configuration, the number of Fibre Channel
ports to access the array, queue depth available, and so on. HP has written several utilities that
read data from disk arrays and report a performance value. This enables users to determine the
throughput of their disk arrays when operating in backup mode and performing file system
traversals typical of this activity.
These Performance Assessment Tools (PAT utilities) can be downloaded from
http://www.hp.com/support/pat. The performance tools are also embedded within the HP industry-
leading Library and Tape Tools diagnostics, which can be downloaded from
http://www.hp.com/support/tapetools.
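The core idea behind such a read-throughput check can be sketched in a few lines. This is not
how the PAT utilities are implemented; it is a minimal stand-in that only measures raw sequential
streaming reads (it does not model file-system traversal), using a scratch file purely for the demo:

```python
# Minimal sequential-read benchmark sketch: read a file in large blocks
# and report MB/s. Point read_throughput() at a file on the array under
# test; the tempfile below exists only to make the demo self-contained.
import os
import tempfile
import time

def read_throughput(path, block_size=1024 * 1024):
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(block_size):
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / elapsed if elapsed > 0 else 0.0

# Demo against an 8 MB scratch file (OS caching will inflate the figure;
# a real benchmark reads a file much larger than RAM).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(8 * 1024 * 1024))
rate = read_throughput(tmp.name)
print(f"{rate:.1f} MB/s")
os.unlink(tmp.name)
```

The number such a sketch reports is an upper bound on what the backup application can feed to
the tape drive from that source.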