ACM Home Page
Please provide us with feedback. Feedback
A general approach for partitioning N-dimensional parallel nested loops with conditionals
Full text PdfPdf (533 KB)
Source ACM Symposium on Parallel Algorithms and Architectures archive
Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures table of contents
Cambridge, Massachusetts, USA
SESSION: Compilers, supercomputing and quantum computing table of contents
Pages: 49 - 58  
Year of Publication: 2006
ISBN:1-59593-452-9
Authors
Arun Kejariwal  University of California at Irvine, Irvine, CA
Alexandru Nicolau  University of California at Irvine, Irvine, CA
Hideki Saito  Intel Corporation, Santa Clara, CA
Xinmin Tian  Intel Corporation, Santa Clara, CA
Milind Girkar  Intel Corporation, Santa Clara, CA
Utpal Banerjee  Intel Corporation, Santa Clara, CA
Constantine D. Polychronopoulos  University of Illinois at Urbana-Champaign, Urbana, IL
Sponsors
ACM: Association for Computing Machinery
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 0,   Downloads (12 Months): 59,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1148109.1148117
What is a DOI?

ABSTRACT

Parallel loops account for the greatest amount of parallelism in scientific and numerical codes. For example, most of the DO loops in SPEC CFP2000 and SPEC OMPM2001 are of DOALL type and account for a large percentage of the total execution time. One of the ways to exploit parallelism is to partition the iteration space of a DOALL loop amongst different processors in a parallel processor system. Naturally, a good partitioning is of key importance to achieve high performance and for efficient use of multiprocessor systems. Although a significant amount of work has been done in partitioning and scheduling of loops with both rectangular and non-rectangular iteration spaces, the problem of partitioning loops with conditionals has not been addressed so far to the best of our knowledge. In this paper, we present a mathematical model for partitioning parallel nested loops, both perfect and non-perfect, with conditionals, where the expressions in a conditional are affine functions of the outer loop indices. We present a loop transformation based on elimination of redundant constraints bounding the iteration space of a nested loop. The transformation plays a critical role during the (static) partitioning process as it helps to capture the "exact" lower and upper bounds (can be either a constant or symbolic) of the loop indices. We generate a canonical form of the loop nest using the transformation and employ the geometric approach we proposed earlier (in [1, 2]) for partitioning the iteration space along an axis corresponding to the outermost loop. For cases in which such a transformation does not exist, we propose a general approach for loop canonicalization. We present several examples from the literature and numerical packages to illustrate the effectiveness of our approach.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
A. Kejariwal, P. D'Alberto, A. Nicolau, and C. D. Polychronopoulos. A geometric approach for partitioning N-dimensional non-rectangular iteration spaces. In Proceedings of the 17th International Workshop on Languages and Compilers for Parallel Computing, pages 102--116, West Lafayette, IN, 2004.
3
 
4
R. Sakellariou. On the Quest for Perfect Load Balance in Loop-Based Parallel Computations. PhD thesis, Department of Computer Science, University of Manchester, October 1996.
 
5
C. Polychronopoulos, D. J. Kuck, and D. A. Padua. Execution of parallel loops on parallel processor systems. In Proceedings of the 1986 International Conference on Parallel Processing, pages 519--527, August 1986.
 
6
 
7
S. Lundstrom and G. Barnes. A controllable MIMD architectures. In Proceedings of the 1980 International Conference on Parallel Processing, St. Charles, IL, August 1980.
 
8
C. Polychronopoulos. Loop coalescing: A compiler transformation for parallel machines. In Proceedings of the 1987 International Conference on Parallel Processing, pages 235--242, August 1987.
 
9
 
10
P. Anninos. Computational cosmology: From the early universe to the large scale structure. In Living Reviews in Relativity 4, 2001.
 
11
M. O'Boyle and G. A. Hedayat. Load balancing of parallel affine loops by unimodular transformations. Technical Report UMCS-92-1-1, Department of Computer Science, University of Manchester, January 1992.
12
 
13
 
14
J. B. J. Fourier. Solution d'une question particulière du calcul des inégalités. In Ouevres II, pages 317--328. 1826.
 
15
L. L. Dines. System of linear inequalities. Annals of Mathematics, 20:191--199, 1919.
 
16
L. L. Dines and N. H. McCoy. On linear inequalities. Transactions of the Royal Society of Canada, 27:217--232, 1933.
 
17
T. S. Motzkin. Beitrage zur theorie der linearen Ungleichungen. PhD thesis, University of Basel, 1936.
 
18
H. W. Kuhn. Solvability and consistency of linear inequalities and inequalities. American Mathematical Monthly, 63:217--232, 1956.
 
19
S. N. Chernikov. The solution of linear programming problems by elimination of unknowns. Doklady Akademii Nauk SSSR, 139:1314--1317, 1961.
 
20
G. Dantzig. Linear Programming and Extensions. Princeton University Press, Princeton, NJ, 1963.
 
21
G. B. Dantzig and B. C. Eaves. Fourier-Motzkin elimination and its dual. Journal of Combinatorial Theory (A), 14(3):288--297, 1973.
 
22
R. J. Duffin. On Fourier's analysis of linear inequality systems. In Mathematical Programming Study 1, pages 71--95. North-Holland, 1974.
 
23
H. P. Williams. Fourier-motzkin elimination extension to integer programming problems. Journal of Combinatorial Theory (A), 21(1):118--123, 1976.
 
24
 
25
A. Bik and H. Wijshoff. Implementation of Fourier-Motzkin elimination. Technical Report TR94-42, Department of Computer Science, University of Leiden, The Netherlands, 1994.
26
 
27
LEDAS Geometric Solver. http://lgs.ledas.com/features.php.
 
28
D. Kuck, A. H. Sameh, R. Cytron, A. Veidenbaum, C. D. Polychronopoulos, G. Lee, T. McDaniel, B. R. Leasure, C. Beckman, J. R. B Davies, and C. P. Kruskal. The effects of program restructuring, algorithm change and architecture choice on program performance. In Proceedings of the 1984 International Conference on Parallel Processing, pages 129--138, August 1984.
 
29
30
31
 
32
 
33
 
34
 
35
36
 
37
R. Triolet. Interprocedural analysis for program restructuring with Parafrase. CSRD Rpt. No. 538, Department of Computer Science, University of Illinois at Urbana-Champaign, December 1985.
38
39
40


Collaborative Colleagues:
Arun Kejariwal: colleagues
Alexandru Nicolau: colleagues
Hideki Saito: colleagues
Xinmin Tian: colleagues
Milind Girkar: colleagues
Utpal Banerjee: colleagues
Constantine D. Polychronopoulos: colleagues