Parallel Programming Guide for HP-UX Systems > Chapter 5 Loop and cross-module optimization features

Loop interchange


The compiler may interchange (or reorder) nested loops for the following reasons:

  • To facilitate other transformations

  • To relocate the loop that is the most profitable to parallelize so that it is outermost

  • To optimize inner-loop memory accesses

Loop interchange takes place at +O3 and above and is enabled by default. Specifying +Onoloop_transform disables loop interchange, as well as loop distribution, loop blocking, loop fusion, loop unroll, and loop unroll and jam.

Example 5-16 Loop interchange

This example begins with the Fortran matrix addition algorithm below:

      DO I = 1, N
        DO J = 1, M
          A(I, J) = B(I, J) + C(I, J)
        ENDDO
      ENDDO

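This loop nest accesses the arrays A, B, and C row by row: the inner J loop steps along a row, so consecutive iterations touch elements that are a full column length apart in memory. Because Fortran stores arrays in column-major order, this access pattern is inefficient. Interchanging the I and J loops, as shown in the following example, gives column-by-column access, in which consecutive inner-loop iterations touch adjacent elements.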

      DO J = 1, M
        DO I = 1, N
          A(I, J) = B(I, J) + C(I, J)
        ENDDO
      ENDDO
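The effect of the interchange on memory access can be checked with a little address arithmetic. The following C sketch is only an illustration, not part of the compiler: the helper names are hypothetical, and it assumes the arrays are declared with exactly N rows so that the leading dimension equals N. It computes the column-major offset of A(I, J) and the distance between consecutive inner-loop iterations for each loop order.

```c
#include <assert.h>
#include <stddef.h>

/* Offset, in elements, of the Fortran element A(i, j) inside an array with
   n_rows rows stored in column-major order. i and j are 1-based, as in the
   Fortran loops. (Hypothetical helper, for illustration only.) */
size_t colmajor_offset(size_t i, size_t j, size_t n_rows)
{
    assert(i >= 1 && i <= n_rows && j >= 1);
    return (i - 1) + (j - 1) * n_rows;
}

/* Original nest: J is the inner loop, so consecutive iterations
   go from A(i, j) to A(i, j + 1). */
size_t stride_inner_j(size_t n_rows)
{
    return colmajor_offset(1, 2, n_rows) - colmajor_offset(1, 1, n_rows);
}

/* Interchanged nest: I is the inner loop, so consecutive iterations
   go from A(i, j) to A(i + 1, j). Requires n_rows >= 2. */
size_t stride_inner_i(size_t n_rows)
{
    return colmajor_offset(2, 1, n_rows) - colmajor_offset(1, 1, n_rows);
}
```

With N = 1000, stride_inner_j(1000) is 1000 elements while stride_inner_i(1000) is 1 element, which is why the interchanged nest makes better use of each cache line it loads.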

Unlike Fortran, C and C++ store arrays in row-major order. An analogous example in C and C++, then, starts from the opposite nest ordering, as shown below.

for (j = 0; j < m; j++)
    for (i = 0; i < n; i++)
        a[i][j] = b[i][j] + c[i][j];

Interchange facilitates row-by-row access. The interchanged loop is shown below.

for (i = 0; i < n; i++)
    for (j = 0; j < m; j++)
        a[i][j] = b[i][j] + c[i][j];
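
Interchange is legal here because each iteration writes a different element of a and reads only b and c, so no iteration depends on another. The sketch below is one way to confirm that the two orderings compute the same result. It is an illustration under stated assumptions: the array sizes N and M, the function names, and the test data are all hypothetical and do not come from the guide.

```c
#include <assert.h>
#include <stddef.h>

#define N 4   /* number of rows (hypothetical size) */
#define M 3   /* number of columns (hypothetical size) */

/* Original C nest: j outermost, so the inner loop walks down a column
   and successive accesses are M elements apart. */
void add_j_outer(double a[N][M], double b[N][M], double c[N][M])
{
    for (size_t j = 0; j < M; j++)
        for (size_t i = 0; i < N; i++)
            a[i][j] = b[i][j] + c[i][j];
}

/* Interchanged nest: i outermost, so the inner loop walks along a row
   and successive accesses are adjacent in memory. */
void add_i_outer(double a[N][M], double b[N][M], double c[N][M])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < M; j++)
            a[i][j] = b[i][j] + c[i][j];
}

/* Returns 1 if both loop orders produce identical sums, 0 otherwise. */
int orders_agree(void)
{
    double b[N][M], c[N][M], a1[N][M], a2[N][M];

    /* Fill the inputs with arbitrary but distinct values. */
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < M; j++) {
            b[i][j] = (double)(i * M + j);
            c[i][j] = 0.5 * (double)(i + j);
        }

    add_j_outer(a1, b, c);
    add_i_outer(a2, b, c);

    /* Each element is computed by the same single addition in both
       versions, so exact comparison is appropriate here. */
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < M; j++)
            if (a1[i][j] != a2[i][j])
                return 0;
    return 1;
}

/* Convenience check that aborts if the two orders ever disagree. */
void check_interchange(void)
{
    assert(orders_agree());
}
```

Only the traversal order differs between the two functions; the set of element-wise additions is the same, which is what allows the compiler to choose whichever order gives the better memory access pattern.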
© Hewlett-Packard Development Company, L.P.