DOI: 10.1145/2807591.2807638
research-article
Public Access

Analyzing and mitigating the impact of manufacturing variability in power-constrained supercomputing

Published: 15 November 2015

ABSTRACT

A key challenge in next-generation supercomputing is to effectively schedule limited power resources. Modern processors suffer from increasingly large power variations due to the chip manufacturing process. These variations lead to power inhomogeneity in current systems and manifest as performance inhomogeneity in power-constrained environments, drastically limiting supercomputing performance. We present a first-of-its-kind study of manufacturing variability on four production HPC systems spanning four microarchitectures, analyze its impact on HPC applications, and propose a novel variation-aware power budgeting scheme to maximize effective application performance. Our low-cost and scalable budgeting algorithm strives to achieve performance homogeneity under a power constraint by deriving application-specific, module-level power allocations. Experimental results on a 1,920-socket system show up to a 5.4X speedup, with an average speedup of 1.8X across all benchmarks, compared to a variation-unaware power allocation scheme.
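
The contrast between variation-aware and variation-unaware budgeting can be illustrated with a minimal Python sketch. This is not the paper's algorithm. It assumes a hypothetical linear power model per module (power = slope * frequency + base), and the names `Module`, `slope`, `base`, `variation_aware_budget`, and `uniform_budget_frequency` are invented here for illustration. Under that assumption, the sketch picks one common frequency that all modules can sustain within the total budget, then derives each module's power allocation from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Hypothetical linear power model: power(f) = slope * f + base (W, GHz)."""
    slope: float  # watts per GHz; differs per module due to variability
    base: float   # frequency-independent power in watts

    def power_at(self, freq):
        return self.slope * freq + self.base

    def freq_at(self, power):
        return (power - self.base) / self.slope


def variation_aware_budget(modules, budget, f_min, f_max):
    """Return (common_freq, per-module allocations) that fit within budget.

    Solves sum(slope_i * f + base_i) = budget for f, so every module runs
    at the same frequency and power-hungry modules receive more power.
    """
    total_slope = sum(m.slope for m in modules)
    total_base = sum(m.base for m in modules)
    freq = (budget - total_base) / total_slope
    if freq < f_min:
        raise ValueError("budget too small to run every module at f_min")
    freq = min(freq, f_max)  # never allocate beyond the maximum frequency
    return freq, [m.power_at(freq) for m in modules]


def uniform_budget_frequency(modules, budget, f_max):
    """Frequency of the slowest module when the budget is split evenly.

    A tightly synchronized application is assumed to run at the pace of
    its slowest module, so this is the effective frequency.
    """
    share = budget / len(modules)
    slowest = min(m.freq_at(share) for m in modules)
    return min(slowest, f_max)
```

For example, with two hypothetical modules `Module(20.0, 10.0)` and `Module(30.0, 10.0)` and a 120 W budget, the variation-aware function yields a common 2.0 GHz with allocations of 50 W and 70 W, whereas an even 60 W split leaves the less efficient module at roughly 1.67 GHz.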


          Published in

          SC '15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
          November 2015
          985 pages
          ISBN: 9781450337236
          DOI: 10.1145/2807591
          General Chair: Jackie Kern
          Program Chair: Jeffrey S. Vetter

          Copyright © 2015 ACM

          © 2015 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          SC '15 Paper Acceptance Rate: 79 of 358 submissions, 22%
          Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%
