The SPLASH-2 programs: characterization and methodological considerations

Published: 01 May 1995

Abstract

The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors. In this context, this paper has two goals. One is to quantitatively characterize the SPLASH-2 programs in terms of the fundamental properties and architectural interactions that are important for understanding them well. The properties we study include the computational load balance, the communication-to-computation ratio and traffic needs, important working set sizes, and issues related to spatial locality, as well as how these properties scale with problem size and the number of processors. The other, related goal is methodological: to help people who will use the programs in architectural evaluations prune the space of application and machine parameters in an informed and meaningful way. For example, by characterizing the working sets of the applications, we describe which operating points in terms of cache size and problem size are representative of realistic situations, which are not, and which are redundant. Using SPLASH-2 as an example, we hope to convey the importance of understanding the interplay of problem size, number of processors, and working sets in designing experiments and interpreting their results.


Published in

ACM SIGARCH Computer Architecture News, Volume 23, Issue 2
Special Issue: Proceedings of the 22nd annual international symposium on Computer architecture (ISCA '95)
May 1995
412 pages
ISSN: 0163-5964
DOI: 10.1145/225830

ISCA '95: Proceedings of the 22nd annual international symposium on Computer architecture
July 1995
426 pages
ISBN: 0897916980
DOI: 10.1145/223982

Copyright © 1995 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

• article
