Efficient, Unified, and Scalable Performance Monitoring for Multiprocessor Operating Systems

Authors:
Robert W. Wisniewski, IBM T. J. Watson Research Center
Bryan Rosenburg, IBM T. J. Watson Research Center

SC '03: Proceedings of the 2003 ACM/IEEE conference on Supercomputing, November 2003
https://doi.org/10.1145/1048935.1050154

Published: 15 November 2003

ABSTRACT

Programming, understanding, and tuning the performance of large multiprocessor systems is challenging. Experts have difficulty achieving good utilization for applications on large machines. The task of implementing a scalable system such as an operating system or database on large machines is even more challenging. And the importance of achieving good performance on multiprocessor machines is increasing as the number of cores per chip increases and as the size of multiprocessors increases. Crucial to achieving good performance is being able to understand the behavior of the system. We have developed an efficient, unified, and scalable tracing infrastructure that allows for correctness debugging, performance debugging, and performance monitoring of an operating system. The infrastructure allows variable-length events to be logged without locking and provides random access to the event stream. The infrastructure allows cheap and parallel logging of events by applications, libraries, servers, and the kernel. The infrastructure was designed for K42, a new open-source research kernel designed to scale near perfectly on large cache-coherent 64-bit multiprocessor systems. The techniques are generally applicable, and many of them have been integrated into the Linux Trace Toolkit. In this paper, we describe the implementation of the infrastructure, how we used the facility, e.g., analyzing lock contention, to understand and achieve K42's scalable performance, and the lessons we learned. The infrastructure has been invaluable to achieving great scalability.
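
The abstract's claim that variable-length events can be logged without locking can be illustrated with a small sketch. The code below is not K42's implementation: the buffer size, header layout, and all names (`trace_buf`, `trace_reserve`, `trace_log`, `trace_next`) are assumptions made for illustration. It shows one common way to get this property, where a writer reserves space by advancing a shared index with a compare-and-swap loop and then fills in its own slots. The sketch leaves out timestamps, protection against overwriting unread data on wraparound, and any signal to readers that an event is complete.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sizes and layout, chosen only for illustration. */
#define TRACE_BUF_WORDS 1024u   /* power of two, so indices wrap with a mask */
#define TRACE_MASK (TRACE_BUF_WORDS - 1u)

typedef struct {
    _Atomic uint64_t index;          /* next free word; only ever grows */
    uint64_t words[TRACE_BUF_WORDS]; /* event storage, one 64-bit word per slot */
} trace_buf;

/* Header word: event length in words (bits 16..23), event id (bits 0..15). */
static uint64_t make_header(uint16_t id, uint8_t len_words)
{
    return ((uint64_t)len_words << 16) | id;
}
static uint8_t header_len(uint64_t h) { return (uint8_t)(h >> 16); }
static uint16_t header_id(uint64_t h) { return (uint16_t)h; }

/* Reserve len words without a lock: retry the compare-and-swap until this
 * caller is the one that advanced the index from old to old + len. */
static uint64_t trace_reserve(trace_buf *b, uint64_t len)
{
    uint64_t old = atomic_load_explicit(&b->index, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               &b->index, &old, old + len,
               memory_order_relaxed, memory_order_relaxed)) {
        /* On failure, old now holds the current index; just retry. */
    }
    return old;
}

/* Log one variable-length event; returns the index of its header word. */
static uint64_t trace_log(trace_buf *b, uint16_t id,
                          const uint64_t *payload, uint8_t n)
{
    assert(n < 255); /* length including the header must fit in 8 bits */
    uint64_t start = trace_reserve(b, (uint64_t)n + 1u);
    b->words[start & TRACE_MASK] = make_header(id, (uint8_t)(n + 1u));
    for (uint8_t i = 0; i < n; i++)
        b->words[(start + 1u + i) & TRACE_MASK] = payload[i];
    return start;
}

/* Step from one event's header to the next using the stored length. */
static uint64_t trace_next(const trace_buf *b, uint64_t pos)
{
    return pos + header_len(b->words[pos & TRACE_MASK]);
}
```

Because each header carries the event's length, a reader that knows where one event starts can step through the following events with `trace_next`. How the paper's infrastructure provides random access into the stream is described in the full text, not here.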

Published in
SC '03: Proceedings of the 2003 ACM/IEEE conference on Supercomputing
November 2003
859 pages
ISBN:1581136951
DOI:10.1145/1048935
General Chair:
James R. McGraw
Copyright © 2003 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
SC '03 Paper Acceptance Rate: 60 of 207 submissions, 29%
Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%