User's Manual

46
DAT 72/DDS-4 Product Manual
DCLZ Algorithm
Within the computer industry, algorithms developed by Abraham Lempel and Jacob
Ziv (enhanced later by Terry Welch) are popular, versatile and powerful compression
methods. These LZ algorithms are basically of two types—LZ1, a sliding window
method, and LZ2/LZW, a hashed directory method.
LZ2
and
LZW
(Lempel-Ziv-Welch) are algorithms based on the
hashed dictionary
method; these algorithms offer an acceptable compromise between speed and
compression ratio. This type of algorithm builds a symbol dictionary to represent
strings as the data is processed and then looks up matching patterns in the
dictionary. By monitoring the compression ratio in this type of algorithm, a new
dictionary can be started when the ratio drops, indicating a change in the data type.
This type of algorithm is responsive to changing data patterns while maintaining
acceptable speed.
Although dependent on the particular implementation, the LZ2/LZW type of algorithm
is generally faster than the LZ1 type because the dictionary structure promotes
efficient searching.
The DCLZ algorithm used in the DDS tape drive is based on the LZ2/LZW algorithm
type described earlier in this chapter. This algorithm has been approved by the US
ANSI standards group and the European ECMA standards group. Both the DDS
Manufacturers Group and QIC tape industry-standards committees accept DCLZ as
an approved standard. Within the DDS Manufacturers Group, DCLZ is the only
approved standard, ensuring complete interchange across all DDS drives and media.
Simplified Compression Operation
The following steps describe a simplified version of operation of the algorithm for
compressing data:
1.
From the current position in the input data stream, the algorithm fetches bytes
(characters) until a string is formed that does not have a matching entry in the
dictionary.
2.
The codeword for the longest string that has an entry in the dictionary (all bytes
except the last) is output.
3.
A dictionary entry for the string formed in step 1 is created.
4.
The current position is moved to the last byte of that string.
5.
Steps 1 through 4 are repeated until the input data stream is completely
processed.