Parallel Programming Guide for HP-UX Systems: K-Class and V-Class Servers > Chapter 6 Parallel optimization features

Inhibiting parallelization

Certain constructs, such as loop-carried dependences (LCDs), inhibit parallelization. Other constructs, such as procedure calls and I/O statements, inhibit parallelization for the same reason they inhibit data localization. The exception is that more categories of loop-carried dependences inhibit parallelization than inhibit data localization, as the following sections describe.

The specific LCDs that inhibit data localization represent a very small portion of all loop-carried dependences. A much broader set of LCDs inhibits parallelization. Examples of various parallel-inhibiting LCDs follow.

Forward LCDs

One type of LCD, the forward LCD, exists when one iteration references a variable whose value is assigned on a later iteration. The Fortran loop below contains this type of LCD on the array A.
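A minimal sketch of a loop with this dependence, written as a small fixed-form Fortran program. The array B, the size N, and the initial data are assumptions chosen only for illustration; only the pattern of assigning A(I) from A(I+1) comes from the description in the text.

```fortran
      PROGRAM FWDLCD
C     Hypothetical size and data, chosen only for illustration.
      INTEGER N
      PARAMETER (N = 8)
      REAL A(N), B(N)
      INTEGER I
      DO I = 1, N
        A(I) = REAL(I)
        B(I) = 1.0
      ENDDO
C     Forward LCD: iteration I reads A(I+1), which iteration I+1
C     later overwrites.
      DO I = 1, N - 1
        A(I) = A(I+1) + B(I)
      ENDDO
      PRINT *, A
      END
```

Run serially, every A(I+1) is read before it is overwritten, so each A(I) receives the original value of its right-hand neighbor plus B(I).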
In this example, the first iteration assigns a value to A(1) and references A(2). The second iteration assigns a value to A(2) and references A(3). In general, the reference to A(I+1) depends on the fact that the (I+1)th iteration, which assigns a new value to A(I+1), has not yet executed.

Forward LCDs inhibit parallelization because, if the loop is broken up to run on several processors, then by the time I reaches its terminal value on one processor, A(I+1) has usually already been computed by another processor. It is, in fact, the first value computed by that other processor. Because the calculation depends on A(I+1) not having been computed yet, this would produce wrong answers.

Backward LCDs

Another type of LCD, the backward LCD, exists when one iteration references a variable whose value was assigned on an earlier iteration. The Fortran loop below contains a backward LCD on the array A.
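A minimal sketch of a loop with a backward LCD, again as a small fixed-form Fortran program. The array B, the size N, the initial data, and the choice to dimension A from index 0 (so that A(I-1) is defined when I = 1) are all assumptions made for illustration.

```fortran
      PROGRAM BWDLCD
C     Hypothetical size and data, chosen only for illustration.
      INTEGER N
      PARAMETER (N = 8)
C     A starts at index 0 so that A(I-1) is defined when I = 1
C     (an assumption made for this sketch).
      REAL A(0:N), B(N)
      INTEGER I
      A(0) = 0.0
      DO I = 1, N
        B(I) = 1.0
      ENDDO
C     Backward LCD: iteration I reads A(I-1), which iteration I-1
C     assigned.
      DO I = 1, N
        A(I) = A(I-1) + B(I)
      ENDDO
      PRINT *, A
      END
```

Run serially, the loop computes a running sum of B, which is correct only because each iteration sees the result of the one before it.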
Here, each iteration assigns a value to A based on the value assigned to A in the previous iteration. If A(I-1) has not been computed before A(I) is assigned, wrong answers result.

Backward LCDs inhibit parallelization because, if the loop is broken up to run on several processors, A(I-1) has not been computed when the first iteration of each processor's chunk executes, except on the processor running the chunk of the loop containing I = 1.

Output LCDs

An output LCD exists when the same memory location is assigned values on two or more iterations. A potential output LCD exists when the compiler cannot determine whether an array subscript takes the same value on different loop iterations. The Fortran loop below contains a potential output LCD on the array A:
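A minimal sketch of a loop with a potential output LCD, in which A is indexed through an integer array J. The size N and the contents of J and B are assumptions chosen for illustration; J deliberately contains a repeated value so that the dependence is real in this instance.

```fortran
      PROGRAM OUTLCD
C     Hypothetical size and data, chosen only for illustration.
      INTEGER N
      PARAMETER (N = 8)
      REAL A(N), B(N)
      INTEGER J(N), I
C     J(3) and J(4) both select element 3 of A.
      DATA J /1, 2, 3, 3, 5, 6, 7, 8/
      DO I = 1, N
        A(I) = 0.0
        B(I) = REAL(I)
      ENDDO
C     Potential output LCD: the compiler cannot tell whether two
C     iterations store to the same element of A.
      DO I = 1, N
        A(J(I)) = B(I)
      ENDDO
C     Run serially, A(3) ends up holding B(4), the later store.
      PRINT *, A
      END
```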
Here, if any referenced elements of J contain the same value, the same element of A is assigned several different elements of B. As this loop is written, any element of A that is assigned more than once should hold the value of the final assignment when the loop ends. This cannot be guaranteed if the loop is run in parallel.

Apparent LCDs

The compiler chooses not to parallelize loops containing apparent LCDs rather than risk wrong answers by doing so. If you are sure that a loop with an apparent LCD is safe to parallelize, you can indicate this to the compiler using the no_loop_dependence directive or pragma, which is explained in the section “Loop-carried dependences (LCDs)”. The following Fortran example illustrates a NO_LOOP_DEPENDENCE directive being used on the output LCD example presented previously:
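A minimal sketch of the directive applied to the indexed-store loop. The C$DIR prefix and the argument form NO_LOOP_DEPENDENCE(A) are assumptions about HP Fortran directive spelling and should be confirmed against the compiler reference; the size N and the data are likewise hypothetical. Here J is a permutation, so the assertion made by the directive actually holds.

```fortran
      PROGRAM NODEP
C     Hypothetical size and data, chosen only for illustration.
      INTEGER N
      PARAMETER (N = 8)
      REAL A(N), B(N)
      INTEGER J(N), I
C     J is a permutation of 1..N, so no element of A is stored twice.
      DATA J /8, 7, 6, 5, 4, 3, 2, 1/
      DO I = 1, N
        B(I) = REAL(I)
      ENDDO
C     Assumed HP directive spelling; other compilers see a comment.
C$DIR NO_LOOP_DEPENDENCE(A)
      DO I = 1, N
        A(J(I)) = B(I)
      ENDDO
      PRINT *, A
      END
```

Because the directive line begins with C in column 1, a fixed-form compiler that does not recognize it treats it as an ordinary comment and runs the loop serially.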
This effectively tells the compiler that no two elements of J are identical, so there is no output LCD and the loop is safe to parallelize. If any of the J values are identical, wrong answers could result.