Parallel Programming Guide for HP-UX Systems: K-Class and V-Class Servers > Chapter 4 Standard optimization features

Block level optimizations (+O1)


At optimization level +O1, the compiler performs optimizations at the block level. It continues to perform the +O0 optimizations and adds the following:

Branch optimization

Branch optimization involves traversing the procedure and transforming branch instruction sequences into more efficient sequences where possible. Examples of possible transformations are:

  • Deleting branches whose target is the fall-through instruction (the target is two instructions away)

  • When the target of a branch is itself an unconditional branch, changing the target of the first branch to the target of the second (unconditional) branch

  • Transforming an unconditional branch at the bottom of a loop that branches to a conditional branch at the top of the loop into a conditional branch at the bottom of the loop

  • Changing an unconditional branch to the exit of a procedure into an exit sequence where possible

  • Changing conditional or unconditional branch instructions that branch over a single instruction into a conditional nullification in the following instruction

  • Looking for conditional branches over unconditional branches, where the sense of the first branch can be inverted and the second branch deleted. These sequences result from null THEN clauses and from THEN clauses that contain only GOTO statements.
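The second transformation in the list (retargeting a branch whose target is an unconditional branch) can be sketched on a toy instruction array. This is only an illustration, not the compiler's implementation: the `insn` structure, the `op` values, and the `chain_branches` function are hypothetical names invented here, and real branch optimization works on machine instructions rather than array indices.

```c
#include <stddef.h>

/* Hypothetical toy instruction set, invented for this illustration. */
enum op { OP_OTHER, OP_BRANCH, OP_COND_BRANCH };

struct insn {
    enum op op;
    size_t  target;   /* index of the branch target; unused for OP_OTHER */
};

/*
 * Retarget every branch whose target is itself an unconditional branch,
 * so that it jumps straight to the final destination. The hop limit
 * guards against cycles of unconditional branches.
 * Returns the number of branches that were retargeted.
 */
size_t chain_branches(struct insn *code, size_t n)
{
    size_t changed = 0;

    for (size_t i = 0; i < n; i++) {
        if (code[i].op == OP_OTHER)
            continue;

        size_t t = code[i].target;
        size_t hops = 0;
        while (t < n && code[t].op == OP_BRANCH && hops < n) {
            t = code[t].target;   /* follow the unconditional branch */
            hops++;
        }

        if (t != code[i].target) {
            code[i].target = t;
            changed++;
        }
    }
    return changed;
}
```

After the pass, the intermediate unconditional branch may no longer be reached by any other branch, in which case a later pass such as dead code elimination could remove it.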

Conditional/unconditional branches

The following Fortran example shows a conditional branch over an unconditional branch being transformed into a more efficient sequence:

      IF (L) THEN
        A=A*2
      ELSE
        GOTO 100
      ENDIF
      B=A+1
  100 C=A*10

becomes:

      IF (.NOT. L) GOTO 100
      A=A*2
      B=A+1
  100 C=A*10

Dead code elimination

Dead code elimination removes code that can never be executed because it is unreachable.

For example, in C:

  if (0)
    a = 1;
  else
    a = 2;

becomes:

  a = 2;
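Unreachable code of this kind often comes from compile-time switches rather than from a literal `if (0)`. The sketch below assumes a hypothetical `TRACE` macro and `scale` function, both invented for illustration. Because the condition is the constant 0, a compiler performing dead code elimination is free to drop the guarded call; whether a particular compiler does so at a given level is not something this example guarantees.

```c
#include <stdio.h>

/* Hypothetical compile-time switch, introduced only for this sketch. */
#define TRACE 0

int scale(int x)
{
    if (TRACE) {
        /* The condition is the constant 0, so this block can never run. */
        printf("scale(%d)\n", x);
    }
    return x * 3;
}
```

With the unreachable block removed, the function body reduces to the single `return` statement.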

Faster register allocation

Faster register allocation involves:

  • Inserting entry and exit code

  • Generating code for operations such as multiplication and division

  • Eliminating unnecessary copy instructions

  • Allocating actual registers to the dummy registers in instructions

Faster register allocation, when used at +O0 or +O1, analyzes register use faster than the global register allocation performed at +O2.

Instruction scheduling

The instruction scheduler optimization performs the following tasks:

  • Reorders the instructions in a basic block to improve memory pipelining. For example, where possible, a load instruction is separated from the use of the loaded register.

  • Where possible, follows a branch instruction with an instruction that can execute while the branch takes effect.

  • Schedules floating-point instructions.
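The first task, separating a load from the use of the loaded register, can be pictured at the source level. This is only an analogy: the scheduler reorders machine instructions, not C statements, and a compiler is free to emit the same instruction order for both functions. The names `sum_pairs_naive` and `sum_pairs_scheduled` are hypothetical.

```c
#include <stddef.h>

/* Each loaded value is consumed by the statement right after its load. */
long sum_pairs_naive(const long *a, const long *b, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        long x = a[i];
        total += x;        /* uses x immediately after loading it */
        long y = b[i];
        total += y;        /* uses y immediately after loading it */
    }
    return total;
}

/* Both loads are issued first, so neither is used right away. */
long sum_pairs_scheduled(const long *a, const long *b, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        long x = a[i];
        long y = b[i];     /* independent load separates x from its use */
        total += x;
        total += y;
    }
    return total;
}
```

Both functions compute the same result; only the distance between each load and its first use differs.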

Peephole optimizations

A peephole optimization is a machine-dependent optimization that makes a pass through low-level assembly-like instruction sequences of the program. It applies patterns to a small window (peephole) of code looking for optimization opportunities. It performs the following optimizations:

  • Changes the addressing mode of instructions so they use shorter sequences

  • Replaces low-level assembly-like instruction sequences with faster (usually shorter) sequences and removes redundant register loads and stores
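The idea of sliding a small window over an instruction sequence can be sketched with a toy example that removes one kind of redundant register load. Everything here is an assumption made for illustration: the `tinsn` structure, the `kind` values, and the `peephole` function are hypothetical, the sequence is assumed to be a single basic block with no branch targets inside it, and a real peephole optimizer matches many machine-specific patterns.

```c
#include <stddef.h>

/* Hypothetical toy instructions, invented for this illustration. */
enum kind { K_LOAD, K_STORE, K_ADD };

struct tinsn {
    enum kind kind;
    int reg;    /* register number */
    int addr;   /* memory slot for K_LOAD and K_STORE; unused for K_ADD */
};

/*
 * Slide a two-instruction window over the sequence. When the most
 * recently kept instruction stores register r to slot s and the current
 * instruction loads slot s back into the same register r, the load is
 * redundant and is dropped.
 * Compacts the array in place and returns the new length.
 */
size_t peephole(struct tinsn *code, size_t n)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        if (out > 0 &&
            code[i].kind == K_LOAD &&
            code[out - 1].kind == K_STORE &&
            code[out - 1].reg == code[i].reg &&
            code[out - 1].addr == code[i].addr)
            continue;   /* the value is already in the register */

        code[out++] = code[i];
    }
    return out;
}
```

A load of the same slot into a different register is kept, because that register does not yet hold the value.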

© Hewlett-Packard Development Company, L.P.