Parallel Programming Guide for HP-UX Systems > Chapter 5 Loop and cross-module optimization features

Loop distribution


Loop distribution is another fundamental +O3 transformation necessary for more advanced transformations. These advanced transformations require that all calculations in a nested loop be performed inside the innermost loop. To facilitate this, loop distribution transforms complicated nested loops into several simple loops that contain all computations inside the body of the innermost loop.

Loop distribution takes place at +O3 and above and is enabled by default. Specifying +Onoloop_transform disables loop distribution, as well as loop interchange, loop blocking, loop fusion, loop unroll, and loop unroll and jam.

Loop distribution can be disabled for a specific loop by placing the no_distribute directive (Fortran) or pragma (C) immediately before that loop.

The form of this directive and pragma is shown in Table 5-6 “Form of no_distribute directive and pragma”.

Table 5-6 Form of no_distribute directive and pragma

Language    Form
Fortran     C$DIR NO_DISTRIBUTE
C           #pragma _CNX no_distribute
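
As an illustration of the C form in Table 5-6, the following sketch places the pragma immediately before a loop nest. The function name scale_rows, its parameters, and the init value are hypothetical and exist only for this illustration; the loop body simply mirrors the shape of the nest in Example 5-12. A compiler that does not recognize _CNX pragmas ignores the line, possibly with a warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of any HP API.
 * a and b are n-by-m arrays stored row by row; c has n elements. */
void scale_rows(size_t n, size_t m, double *a, const double *b,
                double *c, double init)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* Ask the compiler to leave this i loop undistributed. */
#pragma _CNX no_distribute
    for (size_t i = 0; i < n; i++) {
        c[i] = init;                      /* statement outside the inner loop */
        for (size_t j = 0; j < m; j++)
            a[i * m + j] += b[i * m + j] * c[i];
    }
}
```

The pragma applies only to the loop that directly follows it, so other loops in the same file remain eligible for distribution.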

Example 5-12 Loop distribution

This example begins with the following Fortran code:

      DO I = 1, N
        C(I) = 0
        DO J = 1, M
          A(I,J) = A(I,J) + B(I,J) * C(I)
        ENDDO
      ENDDO

Loop distribution creates two copies of the I loop, separating the nested J loop from the assignments to array C. In this way, all assignments are moved to innermost loops. Interchange is then performed on the I and J loops.

The distribution and interchange are shown in the following transformed code:

      DO I = 1, N
        C(I) = 0
      ENDDO
      DO J = 1, M
        DO I = 1, N
          A(I,J) = A(I,J) + B(I,J) * C(I)
        ENDDO
      ENDDO

Distribution can improve efficiency by reducing the number of memory references per loop iteration and the amount of cache thrashing. It also creates more opportunities for interchange.
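
The equivalence of the two forms in Example 5-12 can be checked by hand with a small C sketch. This is only an illustration of what the transformation does, not the compiler's implementation. The function names, the idx helper, and the init parameter are assumptions introduced here; init replaces the constant 0 from the example so that the update to A is observable. The arrays are indexed in column-major order to mimic Fortran storage, which is why the interchanged form walks memory contiguously in its inner loop.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major index, mimicking Fortran's A(I,J): i varies fastest. */
static size_t idx(size_t i, size_t j, size_t n)
{
    return j * n + i;
}

/* Original nest: the assignment to c[i] sits outside the innermost loop. */
void nest_original(size_t n, size_t m, double *a, const double *b,
                   double *c, double init)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++) {
        c[i] = init;
        for (size_t j = 0; j < m; j++)
            a[idx(i, j, n)] += b[idx(i, j, n)] * c[i];
    }
}

/* Hand-written distributed and interchanged form. */
void nest_distributed(size_t n, size_t m, double *a, const double *b,
                      double *c, double init)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* First copy of the i loop: only the assignments to c. */
    for (size_t i = 0; i < n; i++)
        c[i] = init;

    /* Second copy, interchanged: j outside, i inside (contiguous access). */
    for (size_t j = 0; j < m; j++)
        for (size_t i = 0; i < n; i++)
            a[idx(i, j, n)] += b[idx(i, j, n)] * c[i];
}
```

Distribution is safe here because every c[i] is written before any iteration of the second nest reads it, and each element of a is updated exactly once in both versions, so the two functions leave identical values in a and c.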

© Hewlett-Packard Development Company, L.P.