user manual

AMD Athlon Processor x86 Code Optimization
22007E/0 November 1999
Complex Number Arithmetic
Complex numbers have a real part and an imaginary part.
Multiplying complex numbers (e.g., 3 + 4i) is an integral part of
many algorithms, such as the Discrete Fourier Transform (DFT) and
complex FIR filters. Complex number multiplication is shown
below:
(src0.real + src0.imag*i) * (src1.real + src1.imag*i) = result
result = result.real + result.imag*i
result.real <= src0.real*src1.real - src0.imag*src1.imag
result.imag <= src0.real*src1.imag + src0.imag*src1.real
Example:
(1+2i) * (3+4i) => result.real + result.imag*i
result.real <= 1*3 - 2*4 = -5
result.imag <= 1*4 + 2*3 = 10
result = -5 + 10i
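The formulas above can be checked with a few lines of scalar code. The following is a minimal Python sketch, not part of the original example; the function name complex_mul and the (real, imag) tuple layout are assumptions made for illustration.

```python
def complex_mul(src0, src1):
    """Multiply two complex numbers given as (real, imag) tuples."""
    s0_real, s0_imag = src0
    s1_real, s1_imag = src1
    # result.real = src0.real*src1.real - src0.imag*src1.imag
    res_real = s0_real * s1_real - s0_imag * s1_imag
    # result.imag = src0.real*src1.imag + src0.imag*src1.real
    res_imag = s0_real * s1_imag + s0_imag * s1_real
    return (res_real, res_imag)


# The worked example: (1 + 2i) * (3 + 4i)
result = complex_mul((1.0, 2.0), (3.0, 4.0))
print(result)  # (-5.0, 10.0)
```

Each result element needs two multiplies and one add or subtract, which is the work the packed instruction sequence distributes across the two halves of a register.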
Assuming that complex numbers are represented as two-element
vectors [v.real, v.imag], one can see the need for
swapping the elements of one source operand to perform the
multiplies for result.imag, and the need for a mixed
positive/negative accumulation to complete the parallel
computation of result.real and result.imag.
PSWAPD performs the swapping of elements (the example below
swaps s0; swapping either source yields the same products) and
PFPNACC performs the mixed positive/negative accumulation
to complete the computation. The code example below
summarizes the computation of a complex number multiply.
Example:
;MM0 = s0.imag | s0.real        ;reg_hi | reg_lo
;MM1 = s1.imag | s1.real
PSWAPD  MM2, MM0    ;MM2 = s0.real | s0.imag
PFMUL   MM0, MM1    ;MM0 = s0.imag*s1.imag | s0.real*s1.real
PFMUL   MM1, MM2    ;MM1 = s0.real*s1.imag | s0.imag*s1.real
PFPNACC MM0, MM1    ;MM0 = res.imag | res.real
PSWAPD supports independent source and result operands, which
allows it to also perform a copy function. In the above
example, this eliminates the need for a separate MOVQ MM2,
MM0 instruction.
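To make the data flow of the instruction sequence easier to follow, the sketch below emulates it in Python. It is an illustrative model rather than AMD code: each 64-bit MMX register is modeled as a (lo, hi) tuple of Python floats, the helper names pswapd, pfmul, pfpnacc, and complex_mul_3dnow are hypothetical, and single-precision rounding is ignored. The PFPNACC model assumes the instruction's usual definition: the low result is destination low minus destination high, and the high result is source low plus source high.

```python
def pswapd(src):
    """Model of PSWAPD: return src with its two elements exchanged."""
    lo, hi = src
    return (hi, lo)


def pfmul(dst, src):
    """Model of PFMUL: element-wise multiply of two (lo, hi) pairs."""
    return (dst[0] * src[0], dst[1] * src[1])


def pfpnacc(dst, src):
    """Model of PFPNACC: lo = dst.lo - dst.hi, hi = src.lo + src.hi."""
    return (dst[0] - dst[1], src[0] + src[1])


def complex_mul_3dnow(s0, s1):
    """Follow the four-instruction sequence; inputs are (real, imag) pairs."""
    mm0 = s0                   # MM0 = s0.imag | s0.real  (hi | lo)
    mm1 = s1                   # MM1 = s1.imag | s1.real
    mm2 = pswapd(mm0)          # MM2 = s0.real | s0.imag
    mm0 = pfmul(mm0, mm1)      # MM0 = s0.imag*s1.imag | s0.real*s1.real
    mm1 = pfmul(mm1, mm2)      # MM1 = s0.real*s1.imag | s0.imag*s1.real
    mm0 = pfpnacc(mm0, mm1)    # MM0 = res.imag | res.real
    return mm0                 # (res.real, res.imag)


print(complex_mul_3dnow((1.0, 2.0), (3.0, 4.0)))  # (-5.0, 10.0)
```

In this model pswapd returns a new pair and leaves its input untouched, mirroring how PSWAPD writes MM2 while preserving MM0 for the first PFMUL.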