skip to main content
article

Online performance auditing: using hot optimizations without getting burned

Published: 11 June 2006 Publication History

Abstract

As hardware complexity increases and virtualization is added at more layers of the execution stack, predicting the performance impact of optimizations becomes increasingly difficult. Production compilers and virtual machines invest substantial development effort in performance tuning to achieve good performance for a range of benchmarks. Although optimizations typically perform well on average, they often have unpredictable impact on running time, sometimes degrading performance significantly. Today's VMs perform sophisticated feedback-directed optimizations, but these techniques do not address performance degradations, and they actually make the situation worse by making the system more unpredictable.This paper presents an online framework for evaluating the effectiveness of optimizations, enabling an online system to automatically identify and correct performance anomalies that occur at runtime. This work opens the door for a fundamental shift in the way optimizations are developed and tuned for online systems, and may allow the body of work in offline empirical optimization search to be applied automatically at runtime. We present our implementation and evaluation of this system in a product Java VM.

References

[1]
A.-R. Adl-Tabatabai, J. Bharadwaj, D.-Y. Chen, A. Ghuloum, V. Menon, B. Murphy, M. Serrano, and T. Shpeisman. The Star-JIT compiler: A dynamic compiler for managed runtime environments. Intel Technology Journal, 7(1):19--31, Feb. 2003.
[2]
A.-R. Adl-Tabatabai, R. L. Hudson, M. J. Serrano, and S. Subramoney. Prefetch injection based on hardware monitoring and object metadata. ACM SIGPLAN Notices, 39(6):267--276, June 2004. In Conference on Programming Language Design and Implementation (PLDI).
[3]
F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin, M. O'Boyle, J. Thomson, M. Toussaint, and C. Williams. Using machine learning to focus iterative optimization. In The International Symposium on Code Generation and Optimization, 2006.
[4]
A. W. Appel. Simple generational garbage collection and fast allocation. Software-Practice and Experience, 19(2):171--183, Feb. 1989.
[5]
M. Arnold, S. Fink, D. Grove, M. Hind, and P. F. Sweeney. Adaptive optimization in the Jalapeño JVM. ACM SIGPLAN Notices, 35(10):47--65, Oct. 2000. In Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA).
[6]
M. Arnold, S. J. Fink, D. Grove, M. Hind, and P. F. Sweeney. A survey of adaptive optimization in virtual machines. Proceedings of the IEEE, 93(2), 2005. Special issue on Program Generation, Optimization, and Adaptation.
[7]
M. Arnold and B. G. Ryder. A framework for reducing the cost of instrumented code. ACMSIGPLAN Notices, 36(5):168--179, May 2001. In Conference on Programming Language Design and Implementation (PLDI).
[8]
D. F. Bacon, R. Konuru, C. Murthy, and M. Serrano. Thin locks: Featherweight synchronization for Java. ACM SIGPLAN Notices, 33(5):258--268, May 1998. In Conference on Programming Language Design and Implementation (PLDI).
[9]
J. Bilmes, K. Asanovic, C.-W. Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology. In 1997 International Conference on Supercomputing, pages 340--347, 1997.
[10]
J. Cavazos and J. E. B. Moss. Inducing heuristics to decide whether to schedule. ACM SIGPLAN Notices, 39(6):183--194, June 2004. In Conference on Programming Language Design and Implementation (PLDI).
[11]
C. Chen, J. Chame, and M. Hall. Combining models and guided empirical search to optimize for multiple levels of the memory hierarchy. In The International Symposium on Code Generation and Optimization, Mar. 2005.
[12]
H. Chen, J. Lu, W.-C. Hsu, and P.-C. Yew. Continuous adaptive objectcode re-optimization framework. In Ninth Asia-Pacific Computer Systems Architecture, Sept. 2004.
[13]
B. Childers, J. Davidson, and M. L. Soffa. Continuous compilation: A new approach to aggressive and adaptive code transformation. In International Symposium on Parallel and Distributed Processing Symposium, Apr. 2003.
[14]
T. M. Chilimbi and M. Hirzel. Dynamic hot data stream prefetching for general-purpose programs. ACM SIGPLAN Notices, 37(5):199--209, May 2002. In Conference on Programming Language Design and Implementation (PLDI).
[15]
http://www-plan.cs.colorado.edu/henkel/projects/colorado bench.
[16]
K. Cooper, P. Schielke, and D. Subramanian. Optimizing for reduced code space using genetic algorithms. In ACM Conference on Languages, Compilers, and Tools for Embedded Systems, May 1999.
[17]
K. D. Cooper, D. Subramanian, and L. Torczon. Adaptive optimizing compilers for the 21st century. Journal of Supercomputing, May 2002.
[18]
http://pag.csail.mit.edu/daikon.
[19]
J. Dean and C. Chambers. Towards better inlining decisions using inlining trials. In LISP and Functional Programming, pages 273--282, 1994.
[20]
L. P. Deutsch and A. M. Schiffman. Efficient implementation of the Smalltalk-80 system. In 11th Symposium on Principles of Programming Languages (POPL), pages 297--302, Jan. 1984.
[21]
P. C. Diniz and M. C. Rinard. Dynamic feedback: An effective technique for adaptive computing. ACM SIGPLAN Notices, 32(5):71--84, May 1997. In Conference on Programming Language Design and Implementation (PLDI).
[22]
M. Frigo. A fast Fourier transform compiler. ACM SIGPLAN Notices, 34(5):169--180, May 1999. In Conference on Programming Language Design and Implementation (PLDI).
[23]
M. Frigo and S. G. Johnson. FFTW: An adaptive software architecture for the FFT. In 1998 IEEE International Conference on Acoustics Speech and Signal Processing, volume 3, pages 1381--1384. IEEE, 1998.
[24]
G. Fursin, A. Cohen, M. O'Boyle, and O. Temam. A practical method for quickly evaluating program optimizations. In Proceedings of the 1st International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2005), number 3793 in LNCS, pages 29--46. Springer Verlag, November 2005.
[25]
N. Grcevski, A. Kilstra, K. Stoodley, M. Stoodley, and V. Sundaresan. Java just-in-time compiler and virtual machine improvements for server and middleware applications. In 3rd Virtual Machine Research and Technology Symposium (VM), May 2004.
[26]
G. J. Hansen. Adaptive Systems for the Dynamic Run-time Optimization of Programs. PhD thesis, Carnegie-Mellon University, 1974.
[27]
M. Hirzel, A. Diwan, and M. Hind. Pointer analysis in the pressence of dynamic class loading. In 18th European Conference on Object-Oriented Programming (ECOOP), volume 3086 of LNCS, pages 96--122, June 2004.
[28]
U. Hölzle, C. Chambers, and D. Ungar. Optimizing dynamicallytyped object-oriented languages with polymorphic inline caches. In 5th European Conference on Object-Oriented Programming (ECOOP), volume 512 of LNCS, pages 21--38, July 1991.
[29]
U. Hölzle and D. Ungar. Reconciling responsiveness with performance in pure object-oriented languages. Transactions on Programming Languages and Systems (TOPLAS), 18(4):355--400, July 1996.
[30]
X. Huang, S. M. Blackburn, K. S. McKinley, J. E. B. Moss, Z. Wang, and P. Cheng. The garbage collection advantage: Improving program locality. ACM SIGPLAN Notices, 39(10):69--80, Oct. 2004. In Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA).
[31]
E.-J. Im and K. Yelick. Optimizing sparse matrix-vector multiplication for register reuse in SPARSITY. In International Conference on Computational Science, May 2001.
[32]
E.-J. Im, K. Yelick, and R. Vuduc. SPARSITY: Optimization framework for sparse matrix kernels. International Journal of High Performance Computing Applications, 18(1), Jan. 2004.
[33]
K. Ishizaki, M. Takeuchi, K. Kawachiya, T. Suganuma, O. Gohda, T. Inagaki, A. Koseki, K. Ogata, M. Kawahito, T. Yasue, T. Ogasawara, T. Onodera, H. Komatsu, and T. Nakatani. Effectiveness of crossplatform optimizations for a Java just-in-time compiler. ACMSIGPLAN Notices, 38(11):187--204, Nov. 2003.
[34]
T. Kistler and M. Franz. Automated data-member layout of heap objects to improve memory-hierarchy performance. Transactions on Programming Languages and Systems (TOPLAS), 22(3):49--505, 2000.
[35]
T. P. Kistler and M. Franz. Continuous program optimization: Design and evaluation. IEEE Transactions on Computers, 50(6):549--566, June 2001.
[36]
P. Kulkarni, D. Whalley, G. Tyson, and J. Davidson. Exhaustive optimization phase order space exploration. In The International Symposium on Code Generation and Optimization, Mar. 2006.
[37]
P. A. Kulkarni, S. R. Hines, D. B. Whalley, J. D. Hiser, J. W. Davidson, and D. L. Jones. Fast and efficient searches for effective optimization phase sequences. ACM Transactions on Architecture and Code Optimization, 2(2):165--198, June 2005.
[38]
X. Li, M. J. Garzarán, and D. Padua. Optimizing sorting with genetic algorithms. In The International Symposium on Code Generation and Optimization, pages 99--110, Mar. 2005.
[39]
D. Maier, P. Ramarao, M. Stoodley, and V. Sundaresan. Experiences with multithreading and dynamic class loading in a Java just-in-time compiler. In The International Symposium on Code Generation and Optimization, Mar. 2006.
[40]
P. Nagpurkar, M. Hind, C. Krintz, P. F. Sweeney, and V. Rajan. Online phase detection algorithms. In The International Symposium on Code Generation and Optimization, Mar. 2006.
[41]
N. Nethercote, D. Burger, and K. S. McKinley. Self-evaluating compilation applied to loop unrolling. Technical Report TR-06-12, The University of Texas at Austin, Feb. 2006.
[42]
M. Paleczny, C. Vick, and C. Click. The Java Hotspot server compiler. In Java Virtual Machine Research and Technology Symposium (JVM), pages 1--12, Apr. 2001.
[43]
M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. W. Singer, J. Xiong, F. Franchetti, A. Gačić, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code generation for DSP transforms. Proceedings of the IEEE, 93(2), 2005. Special issue on Program Generation, Optimization, and Adaptation.
[44]
R. M. Rabbah and K. V. Palem. Data remapping for design space optimization of embedded memory systems. ACM Transactions on Embedded Computing Systems, 2(2):1--32, May 2003.
[45]
R. Saavedra and D. Park. Improving the effectiveness of software prefetching with adaptive execution. In Conference on Parallel Architectures and Compilation Techniques, 1996.
[46]
http://www.sable.mcgill.ca/software/#soot.
[47]
Standard Performance Evaluation Corporation. SPECjbb2000 Java Business Benchmark. http://www.spec.org/jbb2000.
[48]
Standard Performance Evaluation Corporation. SPECjvm98 Benchmarks. http://www.spec.org/jvm98.
[49]
M. Stephenson and S. Amarasinghe. Predicting unroll factors using supervised classification. In The International Symposium on Code Generation and Optimization, pages 123--134, 2005.
[50]
T. Suganuma, T. Yasue, M. Kawahito, H. Komatsu, and T. Nakatani. A dynamic optimization framework for a Java just-in-time compiler. ACM SIGPLAN Notices, 36(11):180--195, Nov. 2001. In Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA).
[51]
The DaCapo Project. DaCapo Benchmark Suite, version beta051009. http://www-ali.cs.umass.edu/DaCapo/gcbm.html.
[52]
N. Thomas, G. Tanase, O. Tkachyshyn, J. Perdue, N. M. Amato, and L. Rauchwerger. A framework for adaptive algorithm selection in STAPL. In Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 277--288, 2005.
[53]
S. Triantafyllis, M. Vachharajani, and D. August. Compiler optimization-space exploration. Journal of Instruction-Level Parallelism, 7:1--25, Jan. 2005.
[54]
M. J. Voss and R. Eigemann. High-level adaptive program optimization with ADAPT. ACM SIGPLAN Notices, 36(7):93--102, July 2001.
[55]
R. C. Whaley, A. Petitet, and J. J. Dongarra. Automated empirical optimizations of software and the ATLAS project. Parallel Computing, 27(1-2):3--35, 2001.
[56]
http://xml.apache.org/xerces2-j/index.html.
[57]
K. Yotov, X. Li, G. Ren, M. Cibulskis, G. DeJong, M. Garzaran, D. Padua, K. Pingali, P. Stodghill, and P.Wu. A comparison of empirical and model-driven optimization. In ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2003.

Cited By

View all
  • (2021)Cooperative Profile Guided OptimizationsComputer Graphics Forum10.1111/cgf.1438240:8(71-83)Online publication date: 28-Nov-2021
  • (2017)Efficient Adaptive Implementation of the Serial Schedule Generation Scheme Using Preprocessing and Bloom FiltersLearning and Intelligent Optimization10.1007/978-3-319-69404-7_9(124-138)Online publication date: 25-Oct-2017
  • (2011)Using machines to learn method-specific compilation strategiesProceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization10.5555/2190025.2190072(257-266)Online publication date: 2-Apr-2011
  • Show More Cited By

Index Terms

  1. Online performance auditing: using hot optimizations without getting burned

      Recommendations

      Comments

      Information & Contributors

      Information

      Published In

      cover image ACM SIGPLAN Notices
      ACM SIGPLAN Notices  Volume 41, Issue 6
      Proceedings of the 2006 PLDI Conference
      June 2006
      426 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/1133255
      Issue’s Table of Contents
      • cover image ACM Conferences
        PLDI '06: Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation
        June 2006
        438 pages
        ISBN:1595933204
        DOI:10.1145/1133981
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 11 June 2006
      Published in SIGPLAN Volume 41, Issue 6

      Check for updates

      Author Tags

      1. Java
      2. feedback-directed optmizations
      3. virtual machines

      Qualifiers

      • Article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)8
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 05 Mar 2025

      Other Metrics

      Citations

      Cited By

      View all
      • (2021)Cooperative Profile Guided OptimizationsComputer Graphics Forum10.1111/cgf.1438240:8(71-83)Online publication date: 28-Nov-2021
      • (2017)Efficient Adaptive Implementation of the Serial Schedule Generation Scheme Using Preprocessing and Bloom FiltersLearning and Intelligent Optimization10.1007/978-3-319-69404-7_9(124-138)Online publication date: 25-Oct-2017
      • (2011)Using machines to learn method-specific compilation strategiesProceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization10.5555/2190025.2190072(257-266)Online publication date: 2-Apr-2011
      • (2011)An evaluation of different modeling techniques for iterative compilationProceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems10.1145/2038698.2038711(65-74)Online publication date: 9-Oct-2011
      • (2009)Context-aware code optimization2009 IEEE 28th International Performance Computing and Communications Conference10.1109/PCCC.2009.5403838(256-263)Online publication date: Dec-2009
      • (2009)Blind Optimization for Exploiting Hardware FeaturesProceedings of the 18th International Conference on Compiler Construction: Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 200910.1007/978-3-642-00722-4_18(251-265)Online publication date: 27-Mar-2009
      • (2008)QVMACM SIGPLAN Notices10.1145/1449955.144977643:10(143-162)Online publication date: 19-Oct-2008
      • (2008)QVMProceedings of the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications10.1145/1449764.1449776(143-162)Online publication date: 19-Oct-2008
      • (2008)Branch-on-randomProceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization10.1145/1356058.1356070(84-93)Online publication date: 6-Apr-2008
      • (2008)Intelligent compilers2008 IEEE International Conference on Cluster Computing10.1109/CLUSTR.2008.4663796(360-368)Online publication date: Sep-2008
      • Show More Cited By

      View Options

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media