SmartApps: middle-ware for adaptive applications on reconfigurable platforms

Published: 01 April 2006

Abstract

One general avenue to obtaining optimized performance on large and complex systems is to approach optimization from a global perspective of the complete system, customized for each application, i.e., application-centric optimization. Lately, there have been encouraging developments in reconfigurable operating systems and hardware that will enable such customized optimization. For example, machines built with PIMs and FPGAs can be quickly reconfigured to better fit a certain application, and operating systems, such as IBM's K42, can have their services customized to fit the needs and characteristics of an application. While progress in operating systems and hardware has made reconfiguration possible, we still need strategies and techniques to exploit it for improved application performance.

In this paper, we describe the approach we are using in our smart application (SMARTAPPS) project. In the SMARTAPPS executable, the compiler embeds most run-time system services and a feedback loop to monitor performance and trigger run-time adaptations. At run time, after incorporating the code's input and determining the system's state, the SMARTAPPS executable performs an instance-specific optimization. During execution, the application continually monitors its performance and the available resources to determine whether restructuring should occur. The framework includes mechanisms for performing the actual restructuring at various levels, including algorithmic adaptation, tuning of reconfigurable OS services (scheduling policy, page size, etc.), and system configuration (e.g., number of processors). This paper concentrates on techniques for providing customized system services for communication, thread scheduling, memory management, and performance monitoring and modeling.
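The monitor-and-adapt feedback loop described above can be sketched in miniature. The code below is an illustrative approximation only, not the SMARTAPPS implementation: it assumes a hypothetical application that runs in phases, times each phase, and switches between two candidate algorithm variants when throughput degrades past a threshold (standing in for the paper's algorithmic-adaptation level; the names `variant_a`, `variant_b`, and `run_adaptive` are invented for this sketch).

```python
import time

# Two interchangeable algorithm variants for the same computation
# (hypothetical stand-ins for the adaptable algorithms in the text).
def variant_a(n):
    return sum(i * i for i in range(n))

def variant_b(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

VARIANTS = {"a": variant_a, "b": variant_b}

def run_adaptive(phases, work=200_000, slowdown_threshold=1.25):
    """Run `phases` computation phases, monitoring each one and
    switching variants when a phase runs much slower than the best
    observed time -- a toy version of the feedback loop."""
    current = "a"
    best_time = float("inf")
    history = []
    for _ in range(phases):
        start = time.perf_counter()
        VARIANTS[current](work)
        elapsed = time.perf_counter() - start
        history.append((current, elapsed))
        if elapsed < best_time:
            best_time = elapsed
        elif elapsed > slowdown_threshold * best_time:
            # Performance degraded: restructure by switching variants.
            current = "b" if current == "a" else "a"
    return history
```

In the real system the "restructuring" step would also cover OS-service tuning and reconfiguring the number of processors, and the monitor would draw on hardware performance counters rather than wall-clock timing alone.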


Published In

ACM SIGOPS Operating Systems Review, Volume 40, Issue 2, April 2006, 107 pages
ISSN: 0163-5980
DOI: 10.1145/1131322

Publisher

Association for Computing Machinery, New York, NY, United States
