ABSTRACT
As supercomputers close in on exascale performance, the increased number of processors and processing power translates into an increased demand on the underlying network interconnect. The Slim Fly network topology, a new low-diameter and low-latency interconnection network, is gaining interest as one possible solution for next-generation supercomputing interconnect systems. In this paper, we present a high-fidelity Slim Fly flit-level model leveraging the Rensselaer Optimistic Simulation System (ROSS) and Co-Design of Exascale Storage (CODES) frameworks. We validate our Slim Fly model against the Slim Fly model results of Kathareios et al., which are available at moderately sized network scales. We further scale the model up to an unprecedented 1 million compute nodes, and through visualization of network simulation metrics such as link bandwidth, packet latency, and port occupancy, we gain insight into the network behavior at the million-node scale. We also show linear strong scaling of the Slim Fly model on an Intel cluster, achieving a peak event rate of 36 million events per second using 128 MPI tasks to process 7 billion events. Detailed analysis of the underlying discrete-event simulation performance shows how the million-node Slim Fly model simulation executes in 198 seconds on the Intel cluster.
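A minimal sketch of the router-level topology may help make the construction concrete. Slim Fly networks are built from the McKay-Miller-Širáň (MMS) graphs cited in the references below; the Python sketch that follows builds the 2q^2-router graph for a prime q = 4w + 1, following the description given by Besta and Hoefler. It is an illustration only, not the paper's model code: the function names (`primitive_root`, `slim_fly_edges`) are ours, and for brevity it handles prime q rather than general prime powers.

```python
from itertools import product

def primitive_root(q):
    """Find a generator of the multiplicative group of Z_q (q prime)."""
    for g in range(2, q):
        if len({pow(g, k, q) for k in range(1, q)}) == q - 1:
            return g
    raise ValueError("no primitive root found; is q prime?")

def slim_fly_edges(q):
    """Edges of the 2*q^2-router MMS-based Slim Fly graph, q = 4w + 1 prime."""
    xi = primitive_root(q)
    # Generator sets: even powers of xi (the quadratic residues) wire up
    # group 0; odd powers (the non-residues) wire up group 1. Because
    # q = 1 (mod 4), -1 is a residue, so both sets are symmetric.
    X  = {pow(xi, 2 * k, q) for k in range((q - 1) // 2)}
    Xp = {pow(xi, 2 * k + 1, q) for k in range((q - 1) // 2)}
    edges = []
    # Intra-group links: routers in the same column connect when the
    # difference of their row indices lies in the group's generator set.
    for x, y, y2 in product(range(q), repeat=3):
        if y < y2 and (y - y2) % q in X:
            edges.append(((0, x, y), (0, x, y2)))
        if y < y2 and (y - y2) % q in Xp:
            edges.append(((1, x, y), (1, x, y2)))
    # Inter-group links: (0, x, y) connects to (1, m, c) iff y = m*x + c.
    for x, y, m in product(range(q), repeat=3):
        edges.append(((0, x, y), (1, m, (y - m * x) % q)))
    return edges

edges = slim_fly_edges(5)   # 50 routers, diameter 2
print(len(edges))           # 175 links: every router has network radix 7
```

For q = 5 this gives 50 routers, each with (3q - 1)/2 = 7 network ports, and graph diameter 2. As a separate sanity check on the abstract's scaling numbers: committing 7 billion events at a sustained peak of 36 million events per second takes roughly 7x10^9 / 3.6x10^7, or about 194 seconds, consistent with the reported 198-second end-to-end runtime on 128 MPI tasks.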
REFERENCES
- B. Acun, N. Jain, A. Bhatele, M. Mubarak, C. Carothers, and L. Kale. Preliminary evaluation of a parallel trace replay tool for HPC network simulations. In S. Hunold, A. Costan, D. Giménez, A. Iosup, L. Ricci, M. E. Gómez Requena, V. Scarano, A. L. Varbanescu, S. L. Scott, S. Lankes, J. Weidendorfer, and M. Alexander, editors, Euro-Par 2015: Parallel Processing Workshops, volume 9523 of Lecture Notes in Computer Science, pages 417--429. Springer International Publishing, 2015.
- A. Alexandrov, M. F. Ionescu, K. E. Schauser, and C. Scheiman. LogGP: Incorporating long messages into the LogP model -- one step closer towards a realistic model for parallel computation. In Proceedings of the Seventh Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA '95, pages 95--105, New York, NY, USA, 1995. ACM.
- P. D. Barnes, Jr., C. D. Carothers, D. R. Jefferson, and J. M. LaPre. Warp speed: Executing Time Warp on 1,966,080 cores. In Proceedings of the 1st ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, SIGSIM PADS '13, pages 327--336, New York, NY, USA, 2013. ACM.
- M. Besta and T. Hoefler. Slim Fly: A cost effective low-diameter network topology. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC14), Nov. 2014.
- A. Bhatele. Task mapping on complex computer network topologies for improved performance. Technical report LLNL-TR-678732, LDRD Final Report, Lawrence Livermore National Laboratory, Oct. 2015.
- C. D. Carothers, D. Bauer, and S. Pearce. ROSS: A high-performance, low memory, modular Time Warp system. In Proceedings of the Fourteenth Workshop on Parallel and Distributed Simulation, PADS '00, pages 53--60, Washington, DC, USA, 2000. IEEE Computer Society.
- C. D. Carothers, K. S. Perumalla, and R. M. Fujimoto. Efficient optimistic parallel simulations using reverse computation. ACM Trans. Model. Comput. Simul., 9(3):224--253, July 1999.
- CCI. RSA cluster, Nov. 2014.
- J. Cope, N. Liu, S. Lang, P. Carns, C. D. Carothers, and R. Ross. CODES: Enabling co-design of multilayer exascale storage architectures. In Proceedings of the Workshop on Emerging Supercomputing Technologies (WEST), Tucson, AZ, USA, 2011.
- W. Dally. Virtual-channel flow control. IEEE Transactions on Parallel and Distributed Systems, 3(2):194--205, Mar. 1992.
- W. Dally and B. Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2003.
- P. R. Hafner. Geometric realisation of the graphs of McKay-Miller-Širáň. Journal of Combinatorial Theory, Series B, 90(2):223--232, 2004.
- Intel. Ushering in a new era: Argonne National Laboratory's Aurora system. Technical report, Intel Corporation, April 2015.
- G. Kathareios, C. Minkenberg, B. Prisacari, G. Rodriguez, and T. Hoefler. Cost-effective diameter-two topologies: Analysis and evaluation. In Proceedings of the IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC15), Nov. 2015.
- N. Liu, A. Haider, X.-H. Sun, and D. Jin. FatTreeSim: Modeling large-scale fat-tree networks for HPC systems and data centers using parallel and discrete event simulation. In Proceedings of the 3rd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, SIGSIM PADS '15, pages 199--210, New York, NY, USA, 2015. ACM.
- B. D. McKay, M. Miller, and J. Širáň. A note on large graphs of diameter two and given maximum degree. Journal of Combinatorial Theory, Series B, 74(1):110--118, 1998.
- P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L. Jackson, N. Imam, C. Guo, Y. Nakamura, B. Brezzo, I. Vo, S. K. Esser, R. Appuswamy, B. Taba, A. Amir, M. D. Flickner, W. P. Risk, R. Manohar, and D. S. Modha. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science, 345(6197):668--673, 2014.
- M. Miller and J. Širáň. Moore graphs and beyond: A survey of the degree/diameter problem. The Electronic Journal of Combinatorics, Dynamic Survey DS14, 2005.
- M. Mubarak, C. D. Carothers, R. Ross, and P. Carns. Modeling a million-node dragonfly network using massively parallel discrete-event simulation. In Proceedings of the 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC '12, pages 366--376, Washington, DC, USA, 2012. IEEE Computer Society.
- M. Mubarak, C. D. Carothers, R. B. Ross, and P. Carns. A case study in using massively parallel simulation for extreme-scale torus network codesign. In Proceedings of the 2nd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, SIGSIM PADS '14, pages 27--38, New York, NY, USA, 2014. ACM.
- D. M. Nicol. The cost of conservative synchronization in parallel discrete event simulations. J. ACM, 40(2):304--333, Apr. 1993.
- NVIDIA. Summit and Sierra supercomputers: An inside look at the U.S. Department of Energy's new pre-exascale systems. Technical report, NVIDIA, November 2014.
- M. Papka, P. Messina, R. Coffey, and C. Drugan. Argonne Leadership Computing Facility 2014 annual report. Mar. 2015.
- S. Snyder, P. Carns, J. Jenkins, K. Harms, R. Ross, M. Mubarak, and C. Carothers. A case for epidemic fault detection and group membership in HPC storage systems. In S. A. Jarvis, S. A. Wright, and S. D. Hammond, editors, High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, volume 8966 of Lecture Notes in Computer Science, pages 237--248. Springer International Publishing, 2015.
- L. G. Valiant. A scheme for fast parallel communication. SIAM Journal on Computing, 11(2):350--361, 1982.
- S.-J. Wang. Load-balancing in multistage interconnection networks under multiple-pass routing. Journal of Parallel and Distributed Computing, 36(2):189--194, 1996.