HP C Programmer's Guide (92434-90009)

Chapter 4
Optimizing HP C Programs
Controlling Specific Optimizer Features
The +O[no]loop_transform option enables [disables] transformation of eligible loops for
improved cache performance. The most important transformation is the reordering of
nested loops so that the inner loop accesses memory with unit stride, resulting in fewer
cache misses.
+Onoloop_transform may be a helpful option if you encounter problems while using
+Oparallel.
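The loop reordering described above can be sketched by hand in C. This is an
illustration only, not compiler output: the function names and array dimensions are
hypothetical, and whether the optimizer actually interchanges a given loop nest depends on
its own eligibility analysis. Because C stores arrays in row-major order, the first version
strides through memory by a whole row on each inner iteration, while the second touches
consecutive elements.

```c
#include <stddef.h>

#define N_ROWS 8   /* hypothetical dimensions, chosen for illustration */
#define N_COLS 8

/* Column-first traversal: the inner loop advances i, so successive
   accesses are N_COLS doubles apart in memory (non-unit stride). */
void scale_strided(double src[N_ROWS][N_COLS], double dst[N_ROWS][N_COLS])
{
    size_t i, j;

    for (j = 0; j < N_COLS; j++)
        for (i = 0; i < N_ROWS; i++)
            dst[i][j] = 2.0 * src[i][j];
}

/* Interchanged loops: the inner loop advances j, so successive
   accesses are adjacent in memory (unit stride). */
void scale_unit_stride(double src[N_ROWS][N_COLS], double dst[N_ROWS][N_COLS])
{
    size_t i, j;

    for (i = 0; i < N_ROWS; i++)
        for (j = 0; j < N_COLS; j++)
            dst[i][j] = 2.0 * src[i][j];
}
```

Both functions compute the same result; only the order of memory accesses differs. Each
element is computed independently, so the interchange does not change any value.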
+O[no]loop_unroll[=unroll factor]
Optimization level(s): 2, 3, 4
Default: +Oloop_unroll
The +Oloop_unroll option turns on loop unrolling. When you use +Oloop_unroll, you can
also specify an unroll factor to control code expansion. The default unroll factor is 4;
that is, the loop body is replicated four times. By experimenting with different factors,
you may improve the performance of your program.
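To show what an unroll factor of 4 means, the following sketch unrolls a simple loop by
hand. It is an assumption about the general shape of unrolled code, not the code the
compiler generates, and the function names are hypothetical.

```c
#include <stddef.h>

/* Original form: one copy of the loop body per iteration. */
void add_rolled(double *dst, const double *src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] += src[i];
}

/* Unrolled by 4: four copies of the loop body per iteration, followed by
   a cleanup loop that handles the remaining n % 4 elements. */
void add_unrolled4(double *dst, const double *src, size_t n)
{
    size_t i = 0;

    for (; i + 3 < n; i += 4) {
        dst[i]     += src[i];
        dst[i + 1] += src[i + 1];
        dst[i + 2] += src[i + 2];
        dst[i + 3] += src[i + 3];
    }
    for (; i < n; i++)          /* leftover iterations */
        dst[i] += src[i];
}
```

The unrolled version executes fewer loop-control branches at the cost of a larger loop
body, which is the code expansion that the unroll factor controls.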
+O[no]moveflops
Optimization level(s): 2, 3, 4
Default: +Omoveflops
Allows [or disallows] moving conditional floating-point instructions out of loops. The
+Onomoveflops option replaces the obsolete +OE option. Moving these instructions may alter
the behavior of floating-point exception handling.
Use +Onomoveflops if floating-point traps are enabled and you do not want the behavior of
floating-point exceptions to be altered by the relocation of floating-point instructions.
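The following sketch illustrates why relocating a conditional floating-point instruction
can change exception behavior. It is a hand-written illustration under the assumption that
the relocated instruction is a loop-invariant divide; the function names are hypothetical
and the code is not actual optimizer output.

```c
#include <stddef.h>

/* As written: the divide executes only on iterations where flags[i]
   is nonzero. If no flag is set, the divide never executes. */
void fill_as_written(double *out, const int *flags, size_t n,
                     double num, double den)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (flags[i])
            out[i] = num / den;   /* loop-invariant but conditional */
        else
            out[i] = 0.0;
    }
}

/* After moving the divide out of the loop: it executes once,
   unconditionally, before the loop. With floating-point traps enabled
   and den equal to 0.0, this form can trap even when no flag is set. */
void fill_hoisted(double *out, const int *flags, size_t n,
                  double num, double den)
{
    size_t i;
    double q = num / den;         /* now unconditional */

    for (i = 0; i < n; i++)
        out[i] = flags[i] ? q : 0.0;
}
```

When no exception occurs, both forms store the same values. They differ only in whether
and when the divide executes.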
+O[no]parallel
Optimization level(s): 3, 4
Default: +Onoparallel
When a program is compiled with the +Oparallel option, the compiler looks for
opportunities for parallel execution in loops and generates parallel code to execute the loop
on the number of processors set by the MP_NUMBER_OF_THREADS environment variable,
which is discussed in the section "Parallel Execution" at the end of this chapter.
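As a rough guide to the kind of loop that offers an opportunity for parallel execution,
compare the two hypothetical functions below. This is a general illustration of loop
dependences, not a statement of what the compiler will do: the compiler makes its own
decision for each loop, and factors such as possible aliasing between pointer arguments may
prevent it from parallelizing a loop that looks independent.

```c
#include <stddef.h>

/* Independent iterations: y[i] depends only on x[i] and the old y[i],
   so the iterations could in principle run on different processors. */
void scale_and_add(double *y, const double *x, double a, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Loop-carried dependence: v[i] needs the value of v[i - 1] computed by
   the previous iteration, so the iterations cannot simply run at once. */
void running_sum(double *v, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++)
        v[i] = v[i] + v[i - 1];
}
```

Because +Oparallel is valid only at optimization levels 3 and 4, a file containing such
loops would be compiled with one of those levels in addition to +Oparallel.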
If any file of a multi-file program is compiled with the +Oparallel option, then the
remaining files must be compiled with either the +Oparallel or +O[no]parallel_env option.
The +Oparallel_env option ensures a consistent execution environment for all files in the
program, including those that you do not want compiled for parallel execution.
+Onoloop_transform and +Onoinline may be helpful options if you experience any
problem while using +Oparallel.
You may use +Oparallel at optimization levels 3 and 4. The default is +Onoparallel at
levels 0-4. +Oparallel disables +Ofailsafe.