Parallel Programming Guide for HP-UX Systems > Chapter 6 Parallel optimization features

Inhibiting parallelization

Certain constructs, such as loop-carried dependences, inhibit parallelization. Other types of constructs, such as procedure calls and I/O statements, inhibit parallelization for the same reason they inhibit localization. The exception is that more categories of loop-carried dependences inhibit parallelization than inhibit data localization, as described in the following sections.

The specific loop-carried dependences (LCDs) that inhibit data localization represent a very small portion of all loop-carried dependences. A much broader set of LCDs inhibits parallelization. Examples of various parallel-inhibiting LCDs follow.

Example 6-3 Parallel-inhibiting LCDs

One type of LCD, a forward LCD, exists when one iteration references a variable whose value is assigned on a later iteration. The Fortran loop below contains this type of LCD on the array A.
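A minimal sketch of a loop with a forward LCD, consistent with the description that follows. The array B, the bound N, and the initial values are assumptions made for illustration and are not taken from the guide.

```fortran
      PROGRAM FWDLCD
      INTEGER N
      PARAMETER (N = 8)
      REAL A(N), B(N)
      INTEGER I
C     Illustrative initial values (assumed, not from the guide)
      DO I = 1, N
        A(I) = REAL(I)
        B(I) = 1.0
      ENDDO
C     Forward LCD: iteration I reads A(I+1), which iteration I+1
C     overwrites later in serial order
      DO I = 1, N-1
        A(I) = A(I+1) + B(I)
      ENDDO
      PRINT *, A
      END
```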
In this example, the first iteration assigns a value to A(1) and references A(2). The second iteration assigns a value to A(2) and references A(3). The reference to A(I+1) depends on the fact that the (I+1)th iteration, which assigns a new value to A(I+1), has not yet executed.

Forward LCDs inhibit parallelization because, if the loop is broken up to run on several processors, when I reaches its terminal value on one processor, A(I+1) has usually already been computed by another processor. It is, in fact, the first value computed by that other processor. Because the calculation depends on A(I+1) not having been computed yet, this would produce wrong answers.

Example 6-4 Parallel-inhibiting LCDs

Another type of LCD exists when one iteration references a variable whose value was assigned on an earlier iteration. The Fortran loop below contains a backward LCD on the array A.
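A minimal sketch of a loop with a backward LCD, consistent with the description that follows. The array B, the bound N, the starting index, and the initial values are illustrative assumptions.

```fortran
      PROGRAM BWDLCD
      INTEGER N
      PARAMETER (N = 8)
      REAL A(N), B(N)
      INTEGER I
C     Illustrative initial values (assumed, not from the guide)
      DO I = 1, N
        A(I) = 0.0
        B(I) = 1.0
      ENDDO
C     Backward LCD: iteration I reads A(I-1), which iteration I-1
C     must already have assigned
      DO I = 2, N
        A(I) = A(I-1) + B(I)
      ENDDO
      PRINT *, A
      END
```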
Here, each iteration assigns a value to A based on the value assigned to A in the previous iteration. If A(I-1) has not been computed before A(I) is assigned, wrong answers result.

Backward LCDs inhibit parallelization because, if the loop is broken up to run on several processors, A(I-1) has not been computed for the first iteration of the loop on every processor except the one running the chunk that contains the loop's first iteration.

Example 6-5 Output LCDs

An output LCD exists when the same memory location is assigned values on two or more iterations. A potential output LCD exists when the compiler cannot determine whether an array subscript takes the same value on different loop iterations. The Fortran loop below contains a potential output LCD on the array A:
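A minimal sketch of a loop with a potential output LCD, consistent with the description that follows. The contents of J and B and the bound N are illustrative assumptions. J is given a repeated value here so that the dependence is real rather than merely potential.

```fortran
      PROGRAM OUTLCD
      INTEGER N
      PARAMETER (N = 4)
      REAL A(N), B(N)
      INTEGER J(N), I
C     Illustrative data (assumed): J(2) and J(3) are both 2
      DATA J /1, 2, 2, 3/
      DATA B /10.0, 20.0, 30.0, 40.0/
      DATA A /4*0.0/
C     Output LCD: iterations 2 and 3 both assign A(2); the compiler
C     cannot see the contents of J at compile time
      DO I = 1, N
        A(J(I)) = B(I)
      ENDDO
      PRINT *, A
      END
```

Run serially, iteration 3 executes after iteration 2, so A(2) ends up holding B(3). If the two iterations ran on different processors, either assignment could land last.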
Here, if any referenced elements of J contain the same value, the same element of A is assigned several different elements of B. As the loop is written, any element of A that is assigned more than once should hold the value of its final assignment at the end of the loop. This cannot be guaranteed if the loop is run in parallel.

Example 6-6 Apparent LCDs

The compiler chooses not to parallelize loops containing apparent LCDs rather than risk wrong answers by doing so. If you are sure that a loop with an apparent LCD is safe to parallelize, you can indicate this to the compiler using the no_loop_dependence directive or pragma, which is explained in the section “Loop-carried dependences (LCDs)”. The following Fortran example illustrates a NO_LOOP_DEPENDENCE directive being used on the output LCD example presented previously:
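A minimal sketch of the directive applied to the output LCD loop. The directive name comes from the text. The C$DIR sentinel and the (A) argument form are assumptions here, so check them against the section cited above. The contents of J and B are also illustrative, with J chosen to hold no repeated values so that the assertion is true.

```fortran
      PROGRAM NODEP
      INTEGER N
      PARAMETER (N = 4)
      REAL A(N), B(N)
      INTEGER J(N), I
C     Illustrative data (assumed): J holds no repeated values
      DATA J /3, 1, 4, 2/
      DATA B /10.0, 20.0, 30.0, 40.0/
      DATA A /4*0.0/
C     Assert that the loop carries no dependence on A.
C     Sentinel and argument syntax are assumed, not confirmed here.
C$DIR NO_LOOP_DEPENDENCE(A)
      DO I = 1, N
        A(J(I)) = B(I)
      ENDDO
      PRINT *, A
      END
```

Because the directive line begins with C in column 1, a fixed-form compiler that does not recognize it should treat it as an ordinary comment and run the loop serially.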
This effectively tells the compiler that no two elements of J are identical, so there is no output LCD and the loop is safe to parallelize. If any of the J values are identical, wrong answers could result.