HP Compilers for HP Integrity Servers (September 2011)

to create applications which are suitable for any Itanium processor, using the
+DS{blended|itanium2|montecito|poulson|native} compiler option.
The default option +DSblended specifies code scheduling that runs reasonably
well on all implementations.
The +DSpoulson, +DSitanium2, and +DSmontecito options select code
scheduling optimized for the Poulson, Itanium 2, and Montecito processors, respectively.
The +DSmontecito option also selects code optimized for the Montvale and Tukwila
implementations.
The +DSnative option asks the compiler to schedule code optimally for the system
type on which compilation is occurring.
HP will add new choices to the +DS series of options as new processors are introduced.
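As a rough illustration of how these values might be chosen in a build script, the following shell sketch maps a target name to a +DS option and prints the resulting command line instead of running it. The helper name ds_flag, the compiler driver name cc, and the file names are assumptions made for illustration; only the +DS values themselves come from the list above.

```shell
# Hypothetical helper: map a target processor name to a +DS scheduling option.
ds_flag() {
  case "$1" in
    blended|itanium2|montecito|poulson|native)
      printf '+DS%s\n' "$1" ;;          # values listed in the text above
    montvale|tukwila)
      printf '+DSmontecito\n' ;;        # per the text, +DSmontecito also covers these
    *)
      printf '+DSblended\n' ;;          # fall back to the documented default
  esac
}

# Print (do not execute) the compile command; "cc" and "app.c" are placeholders.
echo "cc $(ds_flag tukwila) -o app app.c"
```

Printing the command rather than invoking the compiler keeps the sketch usable on any system; a real build script would run the command on the build host.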
Use these options with care. An application compiled for one processor may run
sub-optimally on another processor. Code scheduled for the Intel Itanium 2
(+DSitanium2) processor may run noticeably slower (5–40%) on an Intel Itanium
processor than code compiled with the +DSitanium option. The relative performance
difference will vary with the application; floating-point intensive codes tend to be more
sensitive to the scheduling model than integer codes. The +DSblended scheduling model
is a hybrid model that attempts to generate code that runs reasonably well on all existing
implementations, and it will continue to evolve as new Itanium implementations are
released. In the AR1003 compilers, the +DSblended model is a combination of the
+DSmontecito and +DSpoulson models.
It might be necessary to recompile applications for a future member of the Intel Itanium
processor family in order to obtain optimal performance. Binary compatibility, however,
is assured regardless of the choice of scheduling option.

Choosing the link mode
By default, all HP compilers assume the -dynamic option. The resulting object file uses
dynamic linking and can be included in a shared library. When the object file will be
linked into an executable rather than a shared library, the option -exec is appropriate.
The -exec option tells the compiler that all defined global symbols are resolved within
the executable itself, usually resulting in faster loads and stores. The option -minshared
directs the compiler to use archive libraries (when available), rather than shared libraries,
to potentially improve performance. The -minshared option tells the compiler that all symbols will be resolved
within the executable itself, except for those symbols declared with the appropriate
pragmas in system header files.
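The choice among these three options can be sketched as a small shell helper. This is a minimal illustration that assumes a hypothetical helper name (link_mode_flag), hypothetical category names, and placeholder file names and compiler driver (cc); only the -dynamic, -exec, and -minshared options come from the text above. The commands are printed, not executed.

```shell
# Hypothetical helper: choose a link-mode option from the intended use of an object file.
link_mode_flag() {
  case "$1" in
    shared-library) echo "-dynamic" ;;    # default: object may be placed in a shared library
    executable)     echo "-exec" ;;       # object will be linked into an executable
    min-shared)     echo "-minshared" ;;  # executable that prefers archive libraries
    *)              echo "-dynamic" ;;    # unknown use: keep the documented default
  esac
}

# Print example compile commands; file names are placeholders.
echo "cc -c $(link_mode_flag executable) main.c"
echo "cc -c $(link_mode_flag shared-library) util.c"
```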

Increasing the page size
If your application incurs a high data or instruction TLB miss rate, requesting a larger
virtual memory page size for data or instructions can provide an additional performance
gain. HP Caliper can tell you if your application is experiencing a high TLB miss rate.
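One way to see why a larger page size can reduce TLB misses is to estimate TLB reach, the amount of memory the TLB can map at one time (number of entries multiplied by page size). The sketch below assumes a hypothetical 128-entry TLB and illustrative page sizes; actual entry counts and supported page sizes depend on the processor and operating system.

```shell
# Hypothetical helper: TLB reach in KB = number of entries * page size in KB.
tlb_reach_kb() {
  echo $(( $1 * $2 ))
}

# Compare reach for the same (assumed) 128-entry TLB at two page sizes.
echo "4 KB pages: $(tlb_reach_kb 128 4) KB"      # 512 KB
echo "4 MB pages: $(tlb_reach_kb 128 4096) KB"   # 524288 KB, that is 512 MB
```

With the same number of entries, the larger page size lets the TLB cover far more of the working set, so fewer accesses miss.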