During backup, these servers should not be under heavy load from other applications that run I/O- and CPU-intensive operations, such as virus scans or a large number of database transactions.
Backup servers demand special attention to proper sizing because they are central to the backup process and run the required agents. Data passes into and out of the server’s main memory as it is read from the disk subsystem or the network and written to tape, so the server memory should be sized accordingly, for example in the case of an online database backup, where the database itself already uses a large amount of memory. Backup servers that receive data over the network also rely on fast connections. If the connection is too slow, a dedicated backup LAN or a move to a SAN architecture can improve performance.
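As a rough illustration, the required backup throughput can be estimated from the data volume and the available backup window and compared with what the network can realistically deliver. The following Python sketch uses purely illustrative figures; the data volume, backup window, and LAN throughput are assumptions, not measurements.

    # Rough sizing check: can the network feed the backup server fast enough?
    # All figures below are illustrative assumptions, not measured values.
    data_volume_gb = 500          # amount of data to back up
    backup_window_h = 8           # allowed backup window in hours

    required_mb_s = data_volume_gb * 1024 / (backup_window_h * 3600)
    print("Required throughput: %.1f MB/s" % required_mb_s)   # ~17.8 MB/s

    # Practical throughput of a shared 100 Mbit/s LAN is roughly 8-10 MB/s,
    # so this load would call for a dedicated backup LAN or a SAN.
    lan_mb_s = 9
    if required_mb_s > lan_mb_s:
        print("Shared LAN is too slow; consider a dedicated backup LAN or SAN")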
Application servers without any backup devices depend primarily on the performance of the connected networks and disks. In some cases, file systems with millions of small files (such as Windows NTFS volumes) can become a bottleneck.
Backup application
For database applications (such as Oracle, Microsoft SQL Server, and Exchange), use the backup integration provided by those applications, as they are tuned to make the best use of their data structures.
Use concurrency (multi-threading) if possible; this allows multiple backups to be interleaved onto the tape, reducing the effect of slow APIs and disk seeks for each individual stream. Note that this can have an impact on restore times, because a particular file set is interleaved with other data.
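The following Python sketch is a simplified illustration of this multiplexing, not the actual Data Protector implementation: blocks from several concurrent streams are interleaved round-robin into one tape stream, which also shows why restoring a single stream must read past blocks belonging to the others. Stream names and block counts are hypothetical.

    # Minimal sketch of tape multiplexing: blocks from several concurrent
    # backup streams are interleaved round-robin into one tape stream.
    streams = {
        "fs_agent_1": ["A1", "A2", "A3"],
        "fs_agent_2": ["B1", "B2", "B3"],
        "db_agent":   ["C1", "C2", "C3"],
    }

    tape = []
    while any(streams.values()):
        for name, blocks in streams.items():
            if blocks:                      # slow streams simply contribute less often
                tape.append((name, blocks.pop(0)))

    print(tape)
    # [('fs_agent_1', 'A1'), ('fs_agent_2', 'B1'), ('db_agent', 'C1'),
    #  ('fs_agent_1', 'A2'), ...]
    # To restore only fs_agent_1, the drive must still read or space over
    # the interleaved blocks of the other streams, which slows the restore.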
File system
There is a significant difference between the raw read data rate of a disk and the file system
read rate. This is because traversing a file system requires multiple, random disk accesses
whereas a continuous read is limited only by the optimized data rates of the disk.
The difference between these two modes becomes more significant as the average file size decreases. For file systems where files are typically smaller than 64 KB, sequential backups (such as raw disk backups) could be considered in order to achieve the data rates required by high-speed tape drives.
File system fragmentation can also be an issue, as it causes additional seeks and lowers throughput.
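The effect can be observed with a simple comparison of the two read patterns. The following Python sketch times a sequential read of one large file against a walk over a directory tree of many small files; the two paths are placeholders and must be adjusted to existing data before running.

    # Rough comparison of sequential read rate versus file-by-file read rate.
    # The paths are placeholders; point them at a large file and a directory
    # tree with many small files before running.
    import os, time

    def read_sequential(path, block=1024 * 1024):
        total, start = 0, time.time()
        with open(path, "rb") as f:
            while True:
                chunk = f.read(block)
                if not chunk:
                    break
                total += len(chunk)
        return total / (time.time() - start) / 1e6   # MB/s

    def read_tree(root):
        total, start = 0, time.time()
        for dirpath, _dirs, files in os.walk(root):  # each file adds open/seek overhead
            for name in files:
                with open(os.path.join(dirpath, name), "rb") as f:
                    total += len(f.read())
        return total / (time.time() - start) / 1e6   # MB/s

    print("large file : %.1f MB/s" % read_sequential("/data/large.img"))
    print("small files: %.1f MB/s" % read_tree("/data/small_files"))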
Disk
If the system performing the backup has a single hard disk drive, the factor most likely to restrict backup performance is the maximum transfer rate of that single disk. In a typical environment, the maximum throughput of a single spindle can be as low as 8 MB/s.
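To put this figure into perspective, the following short calculation (all numbers are illustrative assumptions) shows how long a single spindle at 8 MB/s would need for a 200 GB backup.

    # Illustrative only: time needed to back up a 200 GB disk at 8 MB/s.
    disk_gb, rate_mb_s = 200, 8
    hours = disk_gb * 1024 / rate_mb_s / 3600
    print("%.1f hours" % hours)   # roughly 7.1 hours for a single spindle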
High capacity disk
A high capacity disk is still one spindle with its own physical limitations. Vendors tend to advertise such disks as offering the “best price per MB,” but a single spindle can cause serious problems in high performance environments.
Two smaller spindles provide twice the performance of one large spindle. The backup performance of a large disk may be acceptable when there is no application load, but if an application writes to the same disk in parallel, total disk throughput can drop below 5 MB/s and the hit ratio of a disk array read cache can fall below 60%.
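A simple model explains the drop: if concurrent application I/O forces the disk head to move away between backup reads, every backup read pays a seek penalty before any data is transferred. The seek time, streaming rate, and read size in the following sketch are typical assumptions, not measured values.

    # Simple model of how interleaved application I/O degrades backup reads:
    # every backup read of chunk_mb is preceded by a seek back to the backup data.
    seek_s = 0.010          # average seek plus rotational latency (assumed)
    stream_mb_s = 40.0      # undisturbed sequential read rate of the spindle (assumed)
    chunk_mb = 0.0625       # 64 KB read between application accesses (assumed)

    effective = chunk_mb / (seek_s + chunk_mb / stream_mb_s)
    print("Effective backup rate: %.1f MB/s" % effective)   # about 5 MB/s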
Disk array
Benchmarks have shown that the theoretical performance of a disk array cannot be achieved with standard backup tools. The problem lies in the concurrency of the read processes, which cannot be distributed equally among all I/O channels and disk drives. To the backup software, the disk array appears simply as a set of disks whose internal organization and configuration are hidden. High capacity disks can cause additional problems that intelligent disk array caches cannot overcome; the array is then unable to provide reasonable throughput for backup and restore tasks because the volume of sequential reads and writes is too high.
For this reason, planning for about 50% of the theoretical disk array performance during backup (the 50% backup performance rule) has become a standard for disk array sizing.
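Applied to sizing, the rule means planning with roughly half of the vendor-quoted aggregate throughput. The following sketch uses illustrative array and tape drive figures (assumptions, not specifications) to estimate how many tape drives such an array could keep streaming during backup.

    # Applying the 50% rule when sizing backup for a disk array.
    # Array and tape figures are illustrative assumptions.
    array_theoretical_mb_s = 200            # vendor-quoted aggregate throughput
    usable_mb_s = array_theoretical_mb_s * 0.5

    tape_drive_mb_s = 30                    # native rate of one tape drive
    drives_fed = int(usable_mb_s // tape_drive_mb_s)
    print("Plan for %.0f MB/s, enough to stream %d tape drives"
          % (usable_mb_s, drives_fed))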