Abstract
Algorithms in new application areas like machine learning and network analysis use "irregular" data structures such as graphs, trees and sets. Writing efficient parallel code in these problem domains is very challenging because it requires the programmer to make many choices: a given problem can usually be solved by several algorithms, each algorithm may have many implementations, and the best choice of algorithm and implementation can depend not only on the characteristics of the parallel platform but also on properties of the input data such as the structure of the graph. One solution is to permit the application programmer to experiment with different algorithms and implementations without writing every variant from scratch. Auto-tuning to find the best variant is a more ambitious solution. These solutions require a system for automatically producing efficient parallel implementations from high-level specifications. Elixir, the system described in this paper, is the first step towards this ambitious goal. Application programmers write specifications that consist of an operator, which describes the computations to be performed, and a schedule for performing these computations. Elixir uses sophisticated inference techniques to produce efficient parallel code from such specifications.
We used Elixir to automatically generate many parallel implementations for three irregular problems: breadth-first search, single source shortest path, and betweenness-centrality computation. Our experiments show that the best generated variants can be competitive with handwritten code for these problems from other research groups; for some inputs, they even outperform the handwritten versions.
Supplemental Material
Available for Download
The file contains: * .tex sources files - the main one is res0101-prountzos.tex * gfeds.sty * sigplanconf.cls * .pdf figure files * experiments sub-directory containing figure files
- 9th DIMACS Implementation Challenge. http://www.dis.uniroma1.it/~challenge9/download.shtml, 2009.Google Scholar
- Galois system. http://iss.ices.utexas.edu/?p=projects/galois, 2011.Google Scholar
- D. Bader., J. Gilbert, J. Kepner, and K. Madduri. Hpcs scalable synthetic compact applications graph analysis (SSCA2) benchmark v2.2, 2007. http://www.graphanalysis.org/benchmark/.Google Scholar
- D. A. Bader, S. Kintali, K. Madduri, and M. Mihail. Approximating betweenness centrality. In WAW, 2007. Google ScholarDigital Library
- D. A. Bader and K. Madduri. Parallel algorithms for evaluating centrality indices in real-world networks. In ICPP, 2006. Google ScholarDigital Library
- M. Barnett, B. E. Chang, R. DeLine, B. Jacobs, and K. Rustan~M. Leino. Boogie: A modular reusable verifier for object-oriented programs. In FMCO, 2005. Google ScholarDigital Library
- U. Brandes. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology, 25:163--177, 2001.Google ScholarCross Ref
- H. Bunke, T. Glauser, and T. Tran. An efficient implementation of graph grammars based on the RETE matching algorithm. In Graph Grammars and Their Application to Computer Science, 1991. Google ScholarDigital Library
- J. Cai and R. Paige. Program derivation by fixed point computation. Sci. Comput. Program., 11(3), 1989. Google ScholarDigital Library
- D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-MAT: A recursive model for graph mining. In In SIAM Data Mining, 2004.Google ScholarCross Ref
- T. Cormen, C. Leiserson, R. Rivest, and C. Stein, editors. Introduction to Algorithms. MIT Press, 2001. Google ScholarDigital Library
- L. De~Moura and N. Bjørner. Z3: an efficient SMT solver. In TACAS, 2008.Google ScholarCross Ref
- C. Demetrescu, I. Finocchi, and A. Ribichini. Reactive imperative programming with dataflow constraints. In OOPSLA, 2011. Google ScholarDigital Library
- R. Geiß, G. Batz, D. Grund, S. Hack, and A. Szalkowski. Grgen: A fast spo-based graph rewriting tool. In Graph Transformations, 2006. Google ScholarDigital Library
- A. H. Ghamarian, A. Jalali, and A. Rensink. Incremental pattern matching in graph-based state space exploration. In GraBaTs, 2010.Google Scholar
- S. Hong, H. Chafi, E. Sedlar, and K. Olukotun. Green-marl: a DSL for easy and efficient graph analysis. In ASPLOS, 2012. Google ScholarDigital Library
- S. Itzhaky, S. Gulwani, N. Immerman, and M. Sagiv. A simple inductive synthesis methodology and its applications. In OOPSLA, 2010. Google ScholarDigital Library
- M. Kulkarni, M. Burtscher, C. Cascaval, and K. Pingali. Lonestar: A suite of parallel irregular programs. In ISPASS, 2009.Google ScholarCross Ref
- C. E. Leiserson and T. B. Schardl. A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers). In SPAA, 2010. Google ScholarDigital Library
- A. Lenharth, D. Nguyen, and K. Pingali. Priority queues are not good concurrent priority schedulers. Technical Report TR-11--39, UT Austin, Nov 2011.Google Scholar
- Y. A. Liu and S. D. Stoller. From datalog rules to efficient programs with time and space guarantees. ACM TOPLAS, 31(6), 2009. Google ScholarDigital Library
- U. Meyer and P. Sanders. Δ-stepping: A parallelizable shortest path algorithm. J. Algorithms, 49(1):114--152, 2003. Google ScholarDigital Library
- D. Nguyen and K. Pingali. Synthesizing concurrent schedulers for irregular algorithms. In ASPLOS, 2011. Google ScholarDigital Library
- R. Paige and S. Koenig. Finite differencing of computable expressions. ACM TOPLAS, 4(3):402--454, 1982. Google ScholarDigital Library
- K. Pingali, D. Nguyen, M. Kulkarni, M. Burtscher, M. A. Hassaan, R. Kaleem, T. H. Lee, A. Lenharth, R. Manevich, M. Méndez-Lojo, D. Prountzos, and X. Sui. The TAO of parallelism in algorithms. In PLDI, 2011. Google ScholarDigital Library
- D. Plump. The graph programming language GP. In CAI, 2009. Google ScholarDigital Library
- D. Prountzos, R. Manevich, and K. Pingali. Elixir: A system for synthesizing concurrent graph programs. Technical Report TR-12--16, UT Austin, Aug 2012.Google ScholarDigital Library
- Y. Pu, R. Bodik, and S. Srivastava. Synthesis of first-order dynamic programming algorithms. In OOPSLA, 2011. Google ScholarDigital Library
- M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code generation for DSP transforms. Proceedings of the IEEE, 2005.Google ScholarCross Ref
- G. Rozenberg, editor. Handbook of graph grammars and computing by graph transformation: Volume I. Foundations. World Scientific Publishing Co., Inc., 1997. Google ScholarDigital Library
- A. Schürr, A. J. Winter, and A. Zündorf. Handbook of graph grammars and computing by graph transformation. chapter The PROGRES approach: language and environment. World Scientific Publishing Co., Inc., 1999.Google Scholar
- A. Solar-Lezama, C.G. Jones, and R. Bodik. Sketching concurrent data structures. In PLDI, 2008. Google ScholarDigital Library
- S. Srivastava, S. Gulwani, and J. Foster. From program verification to program synthesis. In POPL, 2010. Google ScholarDigital Library
- Y. Tang, R. A. Chowdhury, B. C. Kuszmaul, C.K. Luk, and C. E. Leiserson. The Pochoir stencil compiler. In SPAA, 2011. Google ScholarDigital Library
- M. Vechev and E. Yahav. Deriving linearizable fine-grained concurrent objects. In PLDI, 2008. Google ScholarDigital Library
- M. Vechev, E. Yahav, and G. Yorsh. Abstraction-guided synthesis of synchronization. In POPL, 2010. Google ScholarDigital Library
Index Terms
- Elixir: a system for synthesizing concurrent graph programs
Recommendations
Elixir: a system for synthesizing concurrent graph programs
OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applicationsAlgorithms in new application areas like machine learning and network analysis use "irregular" data structures such as graphs, trees and sets. Writing efficient parallel code in these problem domains is very challenging because it requires the ...
The tao of parallelism in algorithms
PLDI '11For more than thirty years, the parallel programming community has used the dependence graph as the main abstraction for reasoning about and exploiting parallelism in "regular" algorithms that use dense arrays, such as finite-differences and FFTs. In ...
Structure-driven optimizations for amorphous data-parallel programs
PPoPP '10Irregular algorithms are organized around pointer-based data structures such as graphs and trees, and they are ubiquitous in applications. Recent work by the Galois project has provided a systematic approach for parallelizing irregular applications ...
Comments