Parallel Programming Guide for HP-UX Systems: K-Class and V-Class Servers
Chapter 4: Standard optimization features
Machine instruction level optimizations (+O0)
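At optimization level +O0, the compiler performs optimizations that span only a single source statement. This is the default. The +O0 machine instruction level optimizations include constant folding, partial evaluation of test conditions, simple register assignment, and data alignment on natural boundaries.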
Constant folding is the replacement of operations on constants with the result of the operation. For example, Y=5+7 is replaced with Y=12. More advanced constant folding is performed at optimization level +O2. See the section “Advanced constant folding and propagation” for more information.
Where possible, the compiler determines the truth value of a logical expression without evaluating all the operands. This is known as short-circuiting. The following Fortran statement illustrates this:
    IF ((I .EQ. J) .OR. (I .EQ. K)) GOTO 100
If (I .EQ. J) is true, control immediately goes to 100; otherwise, (I .EQ. K) must be evaluated before control can go to 100 or to the following statement.
Do not rely upon partial evaluation if you use function calls in the logical expression, because the compiler short-circuits only where possible: a function call in an operand may or may not be made, and any side effects it has on variable values may or may not occur.
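The hazard with function calls in a short-circuited condition can be sketched in a few lines. The example below is written in Python rather than Fortran and is only an illustration: Python's `or` always skips its right operand once the left one is true, whereas the compiler described here does so only where possible. The helper names `equals` and `branch_taken` are hypothetical and stand in for the two comparisons in the IF statement above.

```python
calls = []  # records which operands were actually evaluated


def equals(name, left, right):
    """Compare two values and log that this operand was evaluated."""
    calls.append(name)
    return left == right


def branch_taken(i, j, k):
    # Python's `or` skips the right operand once the left one is true.
    return equals("I .EQ. J", i, j) or equals("I .EQ. K", i, k)


calls.clear()
branch_taken(1, 1, 2)
first_true = list(calls)   # only the first comparison ran

calls.clear()
branch_taken(1, 2, 1)
first_false = list(calls)  # both comparisons ran

print(first_true)   # ['I .EQ. J']
print(first_false)  # ['I .EQ. J', 'I .EQ. K']
```

If the second operand were a function that updated a counter or a shared variable, that update would happen in the second call but not in the first, which is why such side effects should not be placed inside a logical expression.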
The compiler may place frequently used variables in registers to avoid more costly accesses to memory. A more advanced register assignment algorithm is used at optimization level +O2. See the section “Global register allocation (GRA)” for more information.
The compiler automatically aligns data objects to their natural boundaries in memory, providing more efficient access to data. This means that a data object's address is integrally divisible by the length of its data type; for example, REAL*8 objects have addresses integrally divisible by 8 bytes.
Declare scalar variables in order from longest to shortest data length to ensure the efficient layout of such aligned data in memory. This minimizes the amount of padding the compiler has to do to get the data onto its natural boundary.
Example: Data alignment on natural boundaries
The following Fortran example describes the alignment of data objects to their natural boundaries:
Here, the compiler must insert 6 unused bytes after BOOL in order to correctly align A, and it must insert 4 unused bytes after C to correctly align D. The same data is more efficiently ordered as shown in the following example:
Natural boundary alignment is performed on all data. This is not to be confused with cache line boundary alignment.
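The padding arithmetic described above can be checked with a small model. The following Python sketch is not the compiler's algorithm; it simply places fields one after another and rounds each address up to a multiple of the field's own length. It assumes declarations along the lines of LOGICAL*2 BOOL, INTEGER*8 A, REAL*4 C, and REAL*8 D (the sizes are inferred from the padding figures quoted above), and it assumes storage starts at offset 0 and follows declaration order. The function name `layout` is hypothetical.

```python
def layout(fields):
    """Place (name, size) fields in order, aligning each to its own size.

    Returns (offsets, padding, total_bytes), where padding[name] is the
    number of unused bytes inserted immediately before that field.
    """
    offsets, padding = {}, {}
    cursor = 0
    for name, size in fields:
        # Round the cursor up to the next multiple of the field size.
        aligned = -(-cursor // size) * size
        padding[name] = aligned - cursor
        offsets[name] = aligned
        cursor = aligned + size
    return offsets, padding, cursor


# Assumed sizes in bytes; only the padding figures come from the text.
poorly_ordered = [("BOOL", 2), ("A", 8), ("C", 4), ("D", 8)]
longest_first = [("A", 8), ("D", 8), ("C", 4), ("BOOL", 2)]

_, pad_poor, size_poor = layout(poorly_ordered)
_, pad_good, size_good = layout(longest_first)

print(pad_poor, size_poor)  # {'BOOL': 0, 'A': 6, 'C': 0, 'D': 4} 32
print(pad_good, size_good)  # {'A': 0, 'D': 0, 'C': 0, 'BOOL': 0} 22
```

Under these assumptions the poorly ordered sequence needs 6 bytes before A and 4 bytes before D, matching the figures in the text, while the longest-to-shortest ordering needs no padding between fields.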