Parallel Programming Guide for HP-UX Systems: K-Class and V-Class Servers > Chapter 6 Parallel optimization features

Reductions

In many cases, the compiler can recognize and parallelize loops containing a special class of dependence known as a reduction. In general, a reduction has the form:

X = X operator Y

where X is a variable not assigned or used elsewhere in the loop, Y is a loop constant expression not involving X, and operator is +, *, .AND., .OR., or .XOR.

The compiler also recognizes reductions of the form:

X = function(X,Y)

where X is a variable not assigned or referenced elsewhere in the loop, Y is a loop constant expression not involving X, and function is the intrinsic MAX function or intrinsic MIN function.
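The two forms above can be illustrated with a minimal C sketch. The function names (sum_reduce, max_reduce, running_total) are hypothetical and chosen only for this illustration, and the C spelling of MAX as a conditional expression is an assumption; the source describes the MAX and MIN forms in terms of the Fortran intrinsics and does not say which C spellings are recognized.

```c
#include <stddef.h>

/* Sum reduction, form X = X operator Y: x appears only in the update. */
double sum_reduce(const double *a, size_t n)
{
    double x = 0.0;
    for (size_t i = 0; i < n; i++)
        x = x + a[i];
    return x;
}

/* Max reduction, form X = function(X, Y), written out by hand in C.
   Assumes n >= 1. */
double max_reduce(const double *a, size_t n)
{
    double x = a[0];
    for (size_t i = 1; i < n; i++)
        x = (a[i] > x) ? a[i] : x;
    return x;
}

/* Counter-example: x is also read elsewhere in the loop body, so the
   update no longer matches the reduction form described above. */
void running_total(const double *a, double *out, size_t n)
{
    double x = 0.0;
    for (size_t i = 0; i < n; i++) {
        x = x + a[i];
        out[i] = x;   /* extra use of x inside the loop */
    }
}
```

In the first two functions the accumulator is touched only by the update statement, which is the property the definitions above require. The third function stores every intermediate value of x, so each iteration depends on the exact result of the previous one.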

Generally, the compiler automatically recognizes reductions in a loop and is able to parallelize the loop. If the loop is under the influence of the prefer_parallel directive or pragma, the compiler still recognizes reductions.

However, in a loop being manipulated by the loop_parallel directive or pragma, reduction analysis is not performed. Consequently, the loop may not be correctly parallelized unless the reduction is enforced using the reduction directive or pragma.
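One way such a loop can go wrong is a lost update on the shared accumulator. The source does not spell out the failure mode, so the following C sketch is an assumption offered for illustration only: it simulates, sequentially and deterministically, two threads that each read the shared sum before either has written it back. The function name lost_update and its parameters are hypothetical.

```c
/* Simulated interleaving of two threads updating a shared sum without
   reduction handling. Runs sequentially, so the result is deterministic. */
double lost_update(double sum, double a, double b)
{
    double reg_a = sum;   /* thread A loads the shared sum */
    double reg_b = sum;   /* thread B loads the same, now stale, value */
    sum = reg_a + a;      /* thread A stores its result */
    sum = reg_b + b;      /* thread B overwrites it; A's contribution is lost */
    return sum;
}
```

Calling lost_update(10.0, 1.0, 2.0) returns 12.0 rather than the 13.0 a serial loop would produce, because the contribution of a is overwritten.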

The form of this directive and pragma is shown in Table 6-3 “Form of reduction directive and pragma”.

Table 6-3 Form of reduction directive and pragma

Language   Form
Fortran    C$DIR REDUCTION
C          #pragma _CNX reduction

Reduction

Reductions commonly appear in the form of sum operations, as shown in the following Fortran example:

      DO I = 1, N
        A(I) = B(I) + C(I)
        .
        .
        .
        ASUM = ASUM + A(I)
      ENDDO

Assuming this loop does not contain any parallelization-inhibiting code, the compiler would automatically parallelize it. The code generated to accomplish this creates temporary, thread-specific copies of ASUM for each thread that runs the loop. When each parallel thread completes its portion of the loop, thread 0 for the current spawn context accumulates the thread-specific values into the global ASUM.
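The transformation described above can be approximated by hand in C. This is a sketch under stated assumptions, not the code the compiler generates: the thread count NTHREADS, the block partitioning of iterations, and the function name reduce_with_partials are all hypothetical, and the "threads" run one after another here so that the example stays deterministic.

```c
#include <stddef.h>

#define NTHREADS 4   /* hypothetical thread count for this sketch */

/* Emulates the reduction transformation: each simulated thread sums its
   share of the iterations into a private partial, then the partials are
   accumulated into the global sum, as thread 0 is described as doing. */
double reduce_with_partials(const double *b, const double *c,
                            double *a, size_t n, double asum)
{
    double partial[NTHREADS] = {0.0};   /* thread-specific copies of ASUM */

    for (int t = 0; t < NTHREADS; t++) {
        size_t lo = n * t / NTHREADS;         /* first iteration of thread t */
        size_t hi = n * (t + 1) / NTHREADS;   /* one past its last iteration */
        for (size_t i = lo; i < hi; i++) {
            a[i] = b[i] + c[i];
            partial[t] += a[i];
        }
    }

    /* Final accumulation step into the global value. */
    for (int t = 0; t < NTHREADS; t++)
        asum += partial[t];
    return asum;
}
```

Because the partial sums are combined in a different order than a serial loop would add the elements, floating-point results from a reduction of this kind can differ slightly from the serial result.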

The following Fortran example shows the reduction directive applied to a similar loop that is parallelized with loop_parallel. The loop_parallel and loop_private directives are described in their own sections of this guide.

C$DIR LOOP_PARALLEL, LOOP_PRIVATE(FUNCTEMP), REDUCTION(SUM)
      DO I = 1, N
        .
        .
        .
        FUNCTEMP = FUNC(X(I))
        SUM = SUM + FUNCTEMP
        .
        .
        .
      ENDDO