HP Compilers for HP Integrity Servers (September 2011)

The high-level scalar optimizer performs expression simplification and canonicalization,
SSA-based dead code removal, copy propagation, constant propagation, and register
promotion, as well as control flow optimizations and basic block cloning.
The interprocedural optimization framework (enabled with -ipo at optimization level
+O2 or higher) has been designed to scale to very large applications.
From a user’s perspective nothing changes; in particular, existing build processes do
not have to be modified. However, because IPO and code generation are performed at
link time, link time may increase significantly.
The internal build model differs slightly from the default build model and is illustrated in
Figure 2 (page 19) (simplified for clarity).
The object files generated with -ipo contain an intermediate representation of the user
code; these object files are called IELF files. IELF files have been designed for fast access.
Compared to regular object files containing debug information (option -g), IELF files are
typically larger by a factor of 3. Compared to object files containing no debug
information, IELF files can be significantly larger, as they have to include, for example,
complete type information. This can be a strain on file systems for very large applications.
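As a rough illustration of the disk space involved, the following Python sketch applies the typical factor of 3 quoted above to a list of object file sizes. The function name and the idea of feeding it the sizes of existing -g object files are assumptions made for this example; the factor is only the typical value given in the text, not a guarantee.

```python
def estimate_ielf_bytes(debug_object_sizes, factor=3.0):
    """Estimate total IELF size from the sizes of -g object files.

    debug_object_sizes: sizes in bytes of regular object files built with -g.
    factor: typical IELF growth factor; 3.0 is the figure quoted in the text.
    """
    return int(sum(debug_object_sizes) * factor)
```

For example, object files totalling 2 GB with debug information would suggest planning for roughly 6 GB of IELF files.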
The elfdump utility can be used to determine whether a given object file is an IELF file
(generated with -ipo) or a real object file: the -f option, which displays the ELF file
header, reports a file type of “HP_IFILE” for IELF files.
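The check described above can be scripted. The following Python sketch runs elfdump -f on a file and looks for the “HP_IFILE” string in the output. It assumes that elfdump is on the PATH; the helper names are hypothetical, and the sketch deliberately does not rely on the exact layout of the elfdump output, only on the presence of the file type string.

```python
import subprocess

# File type string that elfdump -f reports for IELF files (from the text).
IELF_MARKER = "HP_IFILE"


def header_reports_ielf(header_text: str) -> bool:
    """Return True if the given elfdump -f output mentions the IELF type."""
    return IELF_MARKER in header_text


def is_ielf_file(path: str) -> bool:
    """Run `elfdump -f` on path and check the reported file type.

    Assumption: elfdump is installed and on PATH. A non-zero exit status
    raises subprocess.CalledProcessError.
    """
    result = subprocess.run(
        ["elfdump", "-f", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return header_reports_ielf(result.stdout)
```

A build script could call is_ielf_file on each object file, for instance to report which files in a directory were produced with -ipo.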
IELF files are not guaranteed to be compatible from one compiler release to the next. If
a build attempts to use IELF files produced by an older compiler release, a full
recompilation of your application may be necessary after a compiler update.
IELF files are consumed by the interprocedural analysis and optimization phase, which,
as a result, generates a set of final temporary IELF files containing the transformation results.
Great care has been taken to minimize the amount of core memory needed during IPO
and to ensure that the fastest algorithms are chosen (see “Reference 11” (page 35) and
“Reference 12” (page 35)).
The IPO phase also generates a temporary Makefile containing targets for translating
the temporary IELF files into real object files. This translation is done with a standalone
backend called be, which contains the code generator and the low-level optimizer (in
addition to the high-level optimizer). The IPO phase executes make on the generated
Makefile in parallel mode, generating final object files required for the link of the
application. This mechanism is transparent to the user.
The default number of parallel be processes is the number of processors on the
machine. This number can be overridden by setting the environment variable PARALLEL
(see the man pages for make for more details).
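A build script can set PARALLEL for the link step only, without changing the environment of the rest of the build. The following Python sketch shows one way to do this; the function names are hypothetical, and link_command stands for whatever link command line the existing build already uses.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence


def env_with_parallel(jobs: int,
                      base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with PARALLEL set to jobs.

    base defaults to the current process environment; it is not modified.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    env = dict(os.environ if base is None else base)
    env["PARALLEL"] = str(jobs)
    return env


def run_link(link_command: Sequence[str], jobs: int) -> int:
    """Run the (user-supplied) link command with PARALLEL overridden.

    Returns the exit status of the link command.
    """
    completed = subprocess.run(list(link_command), env=env_with_parallel(jobs))
    return completed.returncode
```

Limiting the number of be processes in this way can be useful on shared build machines, where using every processor for code generation is not desirable.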
This parallelization greatly reduces the time spent in code generation and low-level
optimization on machines with multiple processors. For several serial build processes (no
parallel make for the frontend parts), +O4 has been observed to be faster than +O2.