research-article

Leveraging Core Specialization via OS Scheduling to Improve Performance on Asymmetric Multicore Systems

Published: 01 April 2012

Abstract

Asymmetric multicore processors (AMPs) consist of cores that share the same instruction-set architecture (ISA) but differ in microarchitectural features, speed, and power consumption. Because cores with more complex features and higher speed typically occupy more area and consume more energy than simpler, slower cores, they should be reserved for applications that gain significant performance from those features. Having cores of different types in a single system makes it possible to optimize the performance/energy trade-off. To deliver this potential to unmodified applications, the OS scheduler must map threads to cores with the properties of both in mind. Our work describes a Comprehensive scheduler for Asymmetric Multicore Processors (CAMP) that addresses two shortcomings of previous asymmetry-aware schedulers. First, previous schedulers accounted for only one of the workload properties that are crucial for scheduling on AMPs: either efficiency or thread-level parallelism (TLP), but not both. CAMP overcomes this limitation by showing how using efficiency and TLP together in a single scheduling algorithm can improve performance. Second, most existing schedulers that rely on models for estimating how much faster a thread executes on a “fast” core than on a “slow” core (i.e., the speedup factor) were designed specifically for AMP systems whose cores differ only in clock frequency. More realistic AMP systems, however, include cores that differ more significantly in their features. To demonstrate the effectiveness of CAMP in more realistic scenarios, we augmented the CAMP scheduler with a model that predicts the speedup factor on a real AMP prototype that closely matches future asymmetric systems.
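
As a rough illustration of what weighing efficiency and TLP in a single decision could look like, the Python sketch below ranks applications by a hypothetical utility value and hands the fast cores to the top-ranked threads. The `App` class, the `utility` formula, and the greedy assignment are all assumptions made for this example; the abstract does not specify CAMP's algorithm or its speedup-factor model, and this is not a reproduction of either.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    speedup_factor: float  # assumed estimate: fast-core speed / slow-core speed
    n_threads: int         # runnable threads, used here as a simple proxy for TLP


def utility(app: App, n_fast_cores: int) -> float:
    """Hypothetical utility: the application-wide benefit of the fast cores.

    If an application has more threads than there are fast cores, only a
    fraction of its threads can run fast, so the gain is scaled down.
    """
    share = min(1.0, n_fast_cores / app.n_threads)
    return 1.0 + (app.speedup_factor - 1.0) * share


def pick_fast_core_threads(apps: list[App], n_fast_cores: int) -> list[tuple[str, int]]:
    """Greedily give fast cores to threads of the highest-utility applications."""
    ranked = sorted(apps, key=lambda a: utility(a, n_fast_cores), reverse=True)
    chosen: list[tuple[str, int]] = []
    for app in ranked:
        for thread_id in range(app.n_threads):
            if len(chosen) == n_fast_cores:
                return chosen
            chosen.append((app.name, thread_id))
    return chosen


if __name__ == "__main__":
    workload = [
        App("serial_compute", speedup_factor=2.5, n_threads=1),
        App("parallel_compute", speedup_factor=2.5, n_threads=8),
        App("memory_bound", speedup_factor=1.1, n_threads=1),
    ]
    print(pick_fast_core_threads(workload, n_fast_cores=2))
```

In this toy workload the single-threaded, compute-bound application ranks first: it has the same per-thread speedup as the eight-thread application, but all of its work can run on a fast core. A ranking based on the speedup factor alone would not distinguish the two.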

    • Published in

      ACM Transactions on Computer Systems, Volume 30, Issue 2
      April 2012
      111 pages
      ISSN: 0734-2071
      EISSN: 1557-7333
      DOI: 10.1145/2166879

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 April 2012
      • Accepted: 1 January 2012
      • Revised: 1 December 2011
      • Received: 1 March 2011

      Qualifiers

      • research-article
      • Research
      • Refereed
