DOI: 10.1145/2828612.2828622
research-article
Public Access

Lessons Learned from Building In Situ Coupling Frameworks

Published: 15 November 2015

Abstract

Over the past few years, the increasing amounts of data produced by large-scale simulations have motivated a shift from traditional offline data analysis to in situ analysis and visualization. In situ processing began as the coupling of a parallel simulation with an analysis or visualization library, motivated primarily by avoiding the high cost of accessing storage. Going beyond this simple pairwise tight coupling, complex analysis workflows today are graphs with one or more data sources and several interconnected analysis components. In this paper, we review four tools that we have developed to address the challenges of coupling simulations with visualization packages or analysis workflows: Damaris, Decaf, FlowVR and Swift. This self-critical inquiry aims to shed light not only on their potential, but most importantly on the forthcoming software challenges that these and other in situ analysis and visualization frameworks will face in order to move toward exascale.
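As a rough illustration of the coupling the abstract describes, here is a minimal Python sketch of a simulation loop that hands each timestep's data to an in situ analysis graph instead of writing it to storage. All names (`AnalysisNode`, `connect`, `process`) are hypothetical and not taken from Damaris, Decaf, FlowVR, or Swift; this only sketches the general pattern under stated assumptions.

```python
# Hypothetical sketch (not the paper's API): a simulation loop that, instead of
# writing every timestep to storage, hands the data to in situ analysis tasks
# wired together as a small dataflow graph, as the abstract describes.

class AnalysisNode:
    """One component in the analysis graph: applies a function to incoming
    data, then forwards its result to downstream components."""
    def __init__(self, func):
        self.func = func
        self.downstream = []

    def connect(self, node):
        """Wire this node's output into another node; returns the new node."""
        self.downstream.append(node)
        return node

    def process(self, data):
        result = self.func(data)
        for node in self.downstream:
            node.process(result)
        return result

# Build a tiny graph: source -> min/max statistics -> collector.
collected = []
source = AnalysisNode(lambda d: d)                                 # pass-through
stats = source.connect(AnalysisNode(lambda d: (min(d), max(d))))   # reduce step
stats.connect(AnalysisNode(collected.append))                      # sink

# "Simulation" loop: each timestep's field goes straight into the graph,
# never touching the file system (the core idea of in situ processing).
for step in range(3):
    field = [step * x for x in range(5)]  # stand-in for simulation output
    source.process(field)

print(collected)  # one (min, max) pair per timestep
```

Real frameworks differ mainly in where the analysis runs (same cores, dedicated cores, or separate nodes) and in making these hand-offs asynchronous, but the graph-of-components shape is the common thread.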



Information

Published In

ISAV2015: Proceedings of the First Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization
November 2015
51 pages
ISBN:9781450340038
DOI:10.1145/2828612
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Coupling
  2. Damaris
  3. Decaf
  4. Exascale
  5. FlowVR
  6. In Situ Visualization
  7. Simulation
  8. Swift

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SC15

Acceptance Rates

ISAV2015 paper acceptance rate: 8 of 19 submissions (42%)
Overall acceptance rate: 23 of 63 submissions (37%)

Article Metrics

  • Downloads (last 12 months): 97
  • Downloads (last 6 weeks): 6
Reflects downloads up to 05 Mar 2025

Cited By

  • (2022) Enabling Global MPI Process Addressing in MPI Applications. Proceedings of the 29th European MPI Users' Group Meeting, pp. 27-36. DOI: 10.1145/3555819.3555829. 14 Sep 2022.
  • (2022) Accelerating Scientific Workflows on HPC Platforms with In Situ Processing. 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 1-10. DOI: 10.1109/CCGrid54584.2022.00009. May 2022.
  • (2022) Decaf: Decoupled Dataflows for In Situ Workflows. In Situ Visualization for Computational Science, pp. 137-158. DOI: 10.1007/978-3-030-81627-8_7. 5 May 2022.
  • (2022) A Simulation-Oblivious Data Transport Model for Flexible In Transit Visualization. In Situ Visualization for Computational Science, pp. 399-419. DOI: 10.1007/978-3-030-81627-8_18. 5 May 2022.
  • (2021) A Scalability Study of Data Exchange in HPC Multi-component Workflows. IEEE International Conference on Cluster Computing (CLUSTER), pp. 642-648. DOI: 10.1109/Cluster48925.2021.00099. Sep 2021.
  • (2020) The Challenges of In Situ Analysis for Multiple Simulations. ISAV'20: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization, pp. 32-37. DOI: 10.1145/3426462.3426468. 12 Nov 2020.
  • (2020) Benesh: a Programming Model for Coupled Scientific Workflows. IEEE/ACM Fifth International Workshop on Extreme Scale Programming Models and Middleware (ESPM2), pp. 1-9. DOI: 10.1109/ESPM251964.2020.00008. Nov 2020.
  • (2019) Scheduling on Two Unbounded Resources with Communication Costs. Euro-Par 2019: Parallel Processing, pp. 117-128. DOI: 10.1007/978-3-030-29400-7_9. 26 Aug 2019.
  • (2018) The Future of Scientific Workflows. International Journal of High Performance Computing Applications, 32(1):159-175. DOI: 10.5555/3195474.3195477. 1 Jan 2018.
  • (2018) Toward Understanding I/O Behavior in HPC Workflows. IEEE/ACM 3rd International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS), pp. 64-75. DOI: 10.1109/PDSW-DISCS.2018.00012. Nov 2018.
