User's Manual

NUMERIC PROGRAMMING EXAMPLES
Implementing each of these three steps requires attention to detail. To begin with, not all floating-point
values have a numeric meaning. Values such as infinity, indefinite, or Not a Number (NaN) may be
encountered by the conversion routine. The conversion routine should recognize these values and identify
them uniquely.
Special cases of numeric values also exist. Denormals, unnormals, and pseudo-zeros all have a numeric
value but should be recognized, because all of them indicate that precision was lost during some earlier
calculation.
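The recognition step described above can be sketched in Python. This is only an illustration on 64-bit floats, not the NPX routine: the function name `classify` and its labels are assumptions made for the example. Unnormals and pseudo-zeros cannot be represented in a 64-bit float, so they are not covered, and indefinite is reported simply as NaN.

```python
import math
import struct

def classify(value: float) -> str:
    """Return a label for a 64-bit float (illustrative categories only)."""
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "infinity"
    if value == 0.0:
        return "zero"
    # Reinterpret the double as an integer to read its biased exponent field.
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    biased_exponent = (bits >> 52) & 0x7FF
    if biased_exponent == 0:
        return "denormal"  # nonzero value whose exponent field is all zeros
    return "normal"
```

A conversion routine would branch on such a label before attempting any scaling, and would set a flag for the denormal case to record that precision was lost earlier.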
Once it has been determined that the number has a numeric value, and it has been normalized (setting the
appropriate unnormal flags), the value must be scaled to the BCD range.
Scaling the Value
To scale the number, its magnitude must be determined. It is sufficient to calculate the magnitude to
an accuracy of 1 unit, or within a factor of 10 of the given value. After scaling the number, a check
will be made to see if the result falls in the range expected. If not, the result can be adjusted one
decimal order of magnitude up or down. The adjustment test after the scaling is necessary due to
inevitable inaccuracies in the scaling value.
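The adjustment test can be sketched as follows. This is a minimal illustration, assuming a nonzero scaled value that is off by at most one decimal order of magnitude; the function `adjust` and its parameters (`scaled`, `power10`, `digits`) are hypothetical names introduced for the example.

```python
def adjust(scaled: float, power10: int, digits: int):
    """Nudge (scaled, power10) so that 10**(digits-1) <= |scaled| < 10**digits.

    The pair represents the value scaled * 10**power10.
    """
    low, high = 10.0 ** (digits - 1), 10.0 ** digits
    if abs(scaled) >= high:
        # One decimal order too large: shift down and compensate the exponent.
        return scaled / 10.0, power10 + 1
    if abs(scaled) < low:
        # One decimal order too small: shift up and compensate the exponent.
        return scaled * 10.0, power10 - 1
    return scaled, power10
```

Because the value represented by the pair does not change, the check costs only one comparison against each end of the expected range.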
Because the magnitude estimate need only be close, a fast technique is used. The magnitude is estimated
by multiplying the power of 2, the unbiased floating-point exponent, associated with the number by
log10(2). Rounding the result to an integer will produce an estimate of sufficient accuracy. Ignoring the
fraction value can introduce a maximum error of 0.32 in the result.
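The fast estimate can be illustrated with a short Python sketch. It assumes a nonzero, finite 64-bit float and uses `math.frexp` to obtain the power of 2; the name `estimate_magnitude` is introduced here for the example only.

```python
import math

LOG10_2 = math.log10(2.0)

def estimate_magnitude(value: float) -> int:
    """Estimate the decimal exponent of a nonzero finite value."""
    # frexp returns a fraction in [0.5, 1); subtracting 1 gives the exponent
    # that goes with a significand in [1, 2), i.e. the unbiased exponent.
    _, exponent = math.frexp(value)
    return round((exponent - 1) * LOG10_2)
```

For 1000.0 the unbiased exponent is 9, and 9 * log10(2) rounds to 3. For 9.0 the estimate is 1 although the true decimal exponent is 0, which is the kind of one-unit error the later adjustment test corrects.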
Using the magnitude of the value and size of the number string, the scaling factor can be calculated.
Calculating the scaling factor is the most inaccurate operation of the conversion process. The relation
10**X = 2**(X * log2(10)) is used for this function. The exponentiate instruction (F2XM1) will be used.
Due to restrictions on the range of values allowed by the F2XM1 instruction, the power of 2 value will
be split into integer and fraction components. The relation 2**(I + F) = 2**I * 2**F allows using
the FSCALE instruction to recombine the 2**F value, calculated through F2XM1, and the 2**I part.
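The split into integer and fraction components can be sketched in Python. This is not NPX code: `f2xm1` is a stand-in built on `math.expm1`, `math.ldexp` plays the role of FSCALE, and rounding to the nearest integer (so that the fraction lies between -0.5 and 0.5) is an assumption of this sketch rather than a statement of the instruction's actual operand range.

```python
import math

def f2xm1(x: float) -> float:
    """Stand-in for F2XM1: returns 2**x - 1 for a small fraction x."""
    return math.expm1(x * math.log(2.0))

def power_of_ten(x: int) -> float:
    """Compute 10**x as 2**I * 2**F, where I + F = x * log2(10)."""
    p = x * math.log2(10.0)     # 10**x == 2**p
    i = round(p)                # integer component I
    f = p - i                   # fraction component F, |F| <= 0.5
    two_f = f2xm1(f) + 1.0      # 2**F from the restricted-range primitive
    return math.ldexp(two_f, i) # recombine: 2**F * 2**I
```

For x = 3 the power of 2 is about 9.966, which splits into I = 10 and F of about -0.034; the recombined result is 1000 to within rounding error.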
INACCURACY IN SCALING
The inaccuracy of these operations arises because of the trailing zeros placed into the fraction value
when stripping off the integer valued bits. For each integer valued bit in the power of 2 value separated
from the fraction bits, one bit of precision is lost in the fraction field due to the zero fill occurring
in the least significant bits.
Up to 14 bits may be lost in the fraction because the largest allowed floating point exponent value
is 2**14 - 1.
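The bit accounting above can be checked with a few lines of Python. The helper `fraction_bits_left` is hypothetical, and the default of 64 significand bits is an assumption standing for the extended-precision format.

```python
def fraction_bits_left(power_of_two: float, significand_bits: int = 64) -> int:
    """Bits of the significand still available to the fraction component."""
    # Each bit needed to hold the integer part is one bit the fraction loses.
    integer_bits = int(abs(power_of_two)).bit_length()
    return significand_bits - integer_bits
```

With the largest exponent value, 2**14 - 1 = 16383, the integer part occupies 14 bits and, under the 64-bit assumption, 50 bits remain for the fraction.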
AVOIDING UNDERFLOW AND OVERFLOW
The fraction and exponent fields of the number are separated to avoid underflow and overflow in
calculating the scaling values. For example, to scale 10**-4932 to 10**18 requires a scaling factor of
10**4950, which cannot be represented by the NPX.
By separating the exponent and fraction, the scaling operation involves adding the exponents separately
from multiplying the fractions. The exponent arithmetic will involve small integers, all easily
represented by the NPX.
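The same idea can be sketched on 64-bit floats, where the limits are smaller but the reasoning is identical. The function name `scale_by_power_of_ten` is an assumption of this example; `math.frexp` separates the fields and `math.ldexp` recombines them.

```python
import math

def scale_by_power_of_ten(value: float, x: int) -> float:
    """Compute value * 10**x without ever forming 10**x as one number."""
    fraction, exponent = math.frexp(value)  # value = fraction * 2**exponent
    p = x * math.log2(10.0)                 # 10**x == 2**p
    i = round(p)                            # integer part of the scale exponent
    scale_fraction = 2.0 ** (p - i)         # fraction part, close to 1
    # Multiply the fractions; add the exponents as small integers.
    return math.ldexp(fraction * scale_fraction, exponent + i)
```

Scaling 1e-300 by 10**310 gives approximately 1e10 this way, even though 10**310 itself is not representable as a 64-bit float.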