Parallel Programming Guide for HP-UX Systems > Chapter 6 Parallel optimization features

Levels of parallelism


HP compilers support parallelism at the loop level, task level, and region level, as described in Chapter 9 “Parallel programming techniques”. Each level is summarized below.

  • HP compilers automatically exploit loop-level parallelism. This type of parallelism involves dividing a loop into several smaller iteration spaces and scheduling these to run simultaneously on the available processors. For more information, see “Parallelizing loops”.

    Using the +Oparallel option at +O3 and above allows the compiler to automatically parallelize loops that are profitable to parallelize.

    NOTE: Only loops with iteration counts that can be determined prior to loop invocation at runtime are candidates for parallelization. Loops with iteration counts that depend on values or conditions calculated within the loop cannot be parallelized by any means.

  • Specify task-level parallelism using the begin_tasks, next_task and end_tasks directives and pragmas, as discussed in the section “Parallelizing tasks”.

  • Specify parallel regions using the parallel and end_parallel directives and pragmas, as discussed in the section “Parallelizing regions”. These directives and pragmas allow the compiler to run identified sections of code in parallel.
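The iteration-count rule in the NOTE above can be illustrated with two plain C loops. This is a minimal sketch: the function names vector_add and sum_until_limit are hypothetical and do not come from the guide. The first loop has a trip count that is fixed before the loop is entered, so it is the kind of loop that qualifies as a candidate when compiled with +Oparallel at +O3, provided it is also profitable and free of parallelization-inhibiting code. The second loop exits on a value computed inside the loop, so its trip count is unknown at entry.

```c
#include <assert.h>
#include <stddef.h>

/* Candidate: the trip count n is known before the loop is entered,
   and no iteration depends on the result of another. */
void vector_add(double *a, const double *b, const double *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + c[i];
    }
}

/* Not a candidate: the loop exits as soon as the running total exceeds
   limit, so the trip count depends on a value calculated inside the loop.
   Returns the number of elements consumed. */
size_t sum_until_limit(const double *x, size_t n, double limit)
{
    double total = 0.0;
    size_t i = 0;

    assert(x != NULL || n == 0);
    while (i < n) {
        total += x[i];
        i++;
        if (total > limit) {
            break;
        }
    }
    return i;
}
```

In the second function, each iteration also needs the running total from the previous one, which is a separate reason the iterations cannot simply be handed to different processors.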

Loop-level parallelism

HP compilers locate parallelism at the loop level and generate parallel code that automatically runs on as many processors as are available at runtime. Normally, this means all the processors on the system where your program is running. You can specify a smaller number of processors using any of the following:

  • loop_parallel(max_threads=m) directive and pragma: available in Fortran and C

  • prefer_parallel(max_threads=m) directive and pragma: available in Fortran and C

    For more information on the loop_parallel and prefer_parallel directives and pragmas, see Chapter 9 “Parallel programming techniques”.

  • MP_NUMBER_OF_THREADS environment variable: your program reads this variable at runtime. If it is set to a positive integer n, your program executes on n processors. n must be less than or equal to the number of processors in the system where the program is executing.
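The constraint on MP_NUMBER_OF_THREADS (a positive integer no larger than the processor count) can be sketched as a small validation routine. This is illustrative only and is not the HP runtime's implementation: resolve_thread_count is a hypothetical helper, and falling back to all processors when the value is missing or invalid is an assumption, since the text above does not say how such values are handled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of any HP library.
   value  : the text of MP_NUMBER_OF_THREADS, or NULL if it is unset
   nprocs : number of processors on the system (must be at least 1)
   Returns the number of processors the program would run on. */
int resolve_thread_count(const char *value, int nprocs)
{
    char *end;
    long n;

    assert(nprocs >= 1);

    if (value == NULL || *value == '\0') {
        return nprocs;          /* unset: use every available processor */
    }

    n = strtol(value, &end, 10);
    if (*end != '\0' || n < 1 || n > nprocs) {
        return nprocs;          /* ASSUMPTION: invalid values fall back */
    }
    return (int)n;
}
```

A caller would typically pass the environment value directly, for example resolve_thread_count(getenv("MP_NUMBER_OF_THREADS"), nprocs), where nprocs is obtained by whatever means the platform provides.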

Automatic parallelization

Automatic parallelization is useful for programs containing loops. You can use compiler directives or pragmas to improve on the automatic optimizations and to assist the compiler in locating additional opportunities for parallelization.

If you are writing your program entirely under the message-passing paradigm, you must explicitly handle parallelism as discussed in the HP MPI User’s Guide.

Example 6-1 Loop-level parallelism

This example begins with the following Fortran code:

PROGRAM PARAXPL
.
.
.
DO I = 1, 1024
A(I) = B(I) + C(I)
.
.
.
ENDDO

Assuming that the I loop does not contain any parallelization-inhibiting code, this program can be parallelized to run on eight processors by running 128 iterations per processor (1024 iterations divided by 8 processors = 128 iterations each). One processor would run the loop for I = 1 to 128. The next processor would run I = 129 to 256, and so on. The loop could similarly be parallelized to run on any number of processors, with each one taking its appropriate share of iterations.

At a certain point, however, adding more processors does not improve performance. The compiler generates code that runs on as many processors as are available, but the dynamic selection optimization (described in the section “Dynamic selection”) ensures that parallel code is executed only if it is profitable to do so. If the number of available processors does not evenly divide the number of iterations, some processors perform fewer iterations than others.
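The iteration split in Example 6-1 can be written out as a short calculation. The sketch below is one common block-partitioning scheme, not necessarily the one the HP compiler generates: block_range and struct iter_range are hypothetical names, and giving the extra iterations to the lowest-numbered processors is an assumption. It reproduces the 1 to 128 and 129 to 256 ranges from the example and shows how some processors receive fewer iterations when the division is uneven.

```c
#include <assert.h>

/* Inclusive, 1-based range of loop iterations given to one processor. */
struct iter_range {
    int first;
    int last;
};

/* Hypothetical helper: splits iterations 1..niters into nprocs contiguous
   blocks and returns the block for processor p (numbered from 0).
   ASSUMPTION: when niters is not a multiple of nprocs, the first
   (niters % nprocs) processors each take one extra iteration. */
struct iter_range block_range(int niters, int nprocs, int p)
{
    struct iter_range r;
    int base, extra;

    assert(niters >= 0 && nprocs > 0 && p >= 0 && p < nprocs);

    base = niters / nprocs;     /* iterations every processor gets */
    extra = niters % nprocs;    /* leftover iterations to hand out */

    r.first = p * base + (p < extra ? p : extra) + 1;
    r.last = r.first + base + (p < extra ? 1 : 0) - 1;
    return r;
}
```

With 1024 iterations and 8 processors, processor 0 receives 1 to 128 and processor 1 receives 129 to 256, matching the example. With 10 iterations and 4 processors, the blocks under this scheme are 1 to 3, 4 to 6, 7 to 8, and 9 to 10.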

© Hewlett-Packard Development Company, L.P.