User's Manual

General tuning techniques 17
protocols and have long pipelines, the IO Accelerator does not suffer from major latency increases as the
number of outstanding I/Os increases.
The primary methods for generating outstanding I/Os are:
Using multiple threads
Using multiple processes
Using AIO
For IOPS-oriented applications that use small block sizes, multiple threads or multiple outstanding AIO
requests generally yield a significant performance improvement over a single thread. For bandwidth-oriented
applications that use larger block sizes, having multiple outstanding I/Os is less important.
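As a rough illustration of the multiple-process method, the following shell sketch starts several dd readers in the background, each covering its own region of the target, so that up to one I/O per process is outstanding at a time. The function name parallel_read and its arguments are hypothetical and are not part of the IO Accelerator tools. When measuring a real device, you would likely add iflag=direct (GNU dd) so that reads bypass the page cache; it is omitted here so that the sketch also runs against an ordinary file.

```shell
# parallel_read TARGET JOBS BLOCK_SIZE COUNT   (hypothetical helper)
# Starts JOBS dd processes; each reads COUNT blocks of BLOCK_SIZE bytes
# from its own region of TARGET, then waits for all of them to finish.
parallel_read() {
    pr_target=$1
    pr_jobs=$2
    pr_bs=$3
    pr_count=$4
    pr_i=0
    while [ "$pr_i" -lt "$pr_jobs" ]; do
        # skip= is measured in blocks of BLOCK_SIZE, so reader i starts
        # at block i * COUNT and the regions do not overlap.
        dd if="$pr_target" of=/dev/null bs="$pr_bs" \
           skip=$((pr_i * pr_count)) count="$pr_count" 2>/dev/null &
        pr_i=$((pr_i + 1))
    done
    wait    # returns once every background reader has exited
}
```

For example, parallel_read /dev/fioX 8 4096 100000 would start eight readers that each issue 4 KiB reads. Raising the JOBS argument raises the number of I/Os that can be outstanding at once.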
Pre-conditioning
Unlike traditional storage, the characteristics of writes issued to a solid state storage device can affect the
performance of both future write and read operations. Some of the more interesting characteristics to
consider are the size of individual writes (the block size or record size), the order in which writes are
performed, and the block size used to read the data back. A detailed treatment of these effects is outside the
scope of this document; only the most common pre-conditioning issues are addressed here.
The fio-format command reinitializes the IO Accelerator to an empty state. This removes all data from the
drive and erases the history of how that data was written. Erasing this history might cause temporarily
higher performance results for both reads and writes. Before recording results, ensure that the application
or benchmark has run long enough for performance to stabilize.
Read performance can be artificially boosted when reads are performed from previously unwritten sectors.
After fio-format is complete, any sector that is read before data is written to it returns all binary zeros
(0x0), and it returns them faster than data is read from a sector that has previously been written. This is
the same behavior that filesystems and operating systems use to handle sparse files. The read performance
achieved from these uninitialized sectors is not indicative of real-world IO Accelerator read performance
and should be disregarded. The performance numbers published by HP exclude this acceleration.
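The sparse-file comparison can be observed on an ordinary Linux filesystem. The sketch below uses a hypothetical function name and a regular file rather than the IO Accelerator. It sets a file's size to 1 MiB without writing any data and then counts the non-zero bytes that a full read returns. Whether the file is actually stored sparsely depends on the filesystem, but in either case every byte that was never written reads back as zero.

```shell
# count_nonzero_unwritten FILE   (hypothetical helper)
# Sets FILE's size to 1 MiB without writing data, then prints how many
# non-zero bytes a full read of the file returns.
count_nonzero_unwritten() {
    # count=0 copies nothing; with bs=1, seek=1048576 makes GNU dd set
    # the file length to 1 MiB without writing any blocks.
    dd if=/dev/zero of="$1" bs=1 count=0 seek=1048576 2>/dev/null
    # Delete every zero byte from the stream and count what is left.
    tr -d '\000' < "$1" | wc -c
}
```

Because nothing was written, the count printed is 0.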
To avoid measuring invalid read performance, ensure that you write data to each sector that will be used in
benchmarking. In Linux, the entire device can be easily written to using the dd command:
CAUTION: The dd command destroys all data on the drive.
$ dd if=/dev/zero of=/dev/fioX bs=10M oflag=direct
Under Windows® operating systems, when a raw block test is being used, that same test can generally be
used to write data to the device before testing. A test that runs on top of a filesystem must first populate
its data, so it is not affected by this artificial performance boost.
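For a filesystem-based test on Linux, populating the data can be as simple as writing the test file in full before the measured run. The following is a minimal sketch; the function name and the choice of zero-filled data are assumptions, and a real benchmark tool may provide its own populate step.

```shell
# populate_file FILE SIZE_MIB   (hypothetical helper)
# Writes SIZE_MIB mebibytes to FILE so that every block the benchmark
# will later read has been written at least once.
populate_file() {
    # conv=fsync (GNU dd) flushes the written data to the device
    # before dd exits.
    dd if=/dev/zero of="$1" bs=1048576 count="$2" conv=fsync 2>/dev/null
}
```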
If an application writes in a smaller block size than it uses to read the data back, the read bandwidth might
be constrained to the maximum bandwidth achievable at a block size equivalent to the original write block
size.
For example, if an application performs random 512-byte writes and then reads the data back in 4 KiB blocks,
the performance might be limited to that of issuing 512-byte reads directly (in this case, the IO Accelerator
is IOPS limited rather than bandwidth limited).
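The limit in this example can be worked through with simple arithmetic. The sketch below assumes a simplified model in which each block of the original write size costs one I/O operation when it is read back. The function name and the 100000 IOPS figure are illustrative only and are not published IO Accelerator numbers.

```shell
# read_cap IOPS WRITE_BS READ_BS   (hypothetical helper)
# Prints the read-bandwidth ceiling in bytes per second, followed by the
# resulting number of READ_BS-sized reads per second, assuming one I/O
# per WRITE_BS-sized block (a simplified model).
read_cap() {
    rc_iops=$1
    rc_write_bs=$2
    rc_read_bs=$3
    # Bandwidth is capped at the rate achievable with WRITE_BS-sized I/Os.
    rc_bytes_per_sec=$((rc_iops * rc_write_bs))
    # Each larger read spans READ_BS / WRITE_BS written blocks.
    rc_reads_per_sec=$((rc_bytes_per_sec / rc_read_bs))
    echo "$rc_bytes_per_sec $rc_reads_per_sec"
}

read_cap 100000 512 4096    # prints: 51200000 12500
```

Under these assumptions, a device that sustains 100000 I/Os per second at 512 bytes delivers about 51.2 MB/s whether the application reads in 512-byte or 4 KiB units. The 4 KiB reads complete at one eighth of the I/O rate because each one spans eight written blocks.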
The most common ways to reset a device state are: