DOI:10.1145/2597652.2597679
research-article

Supporting storage configuration for I/O intensive workflows

Published: 10 June 2014

Abstract

System provisioning, resource allocation, and system configuration decisions for I/O-intensive workflow applications are complex even for expert users. Users face choices at multiple levels: allocating resources to individual sub-systems (e.g., the application layer, the storage layer) and configuring each of these optimally (e.g., replication level, chunk size, and caching policies in the case of storage), all of which have a large impact on overall application performance. This paper presents our progress on supporting these provisioning, allocation, and configuration decisions for workflow applications. To enable selecting a good configuration in a reasonable time, we propose an approach that accelerates the exploration of the configuration space using a low-cost performance predictor that estimates the total execution time of a workflow application in a given setup. Our evaluation shows that (i) the predictor is effective in identifying the desired system configuration, (ii) it can scale to model a workflow application run on an entire cluster, and (iii) it uses over 2000x fewer resources (machines x time) than running the actual application.
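The exploration idea in the abstract can be illustrated with a minimal sketch. This is not the paper's predictor: `predict_runtime` is a hypothetical toy cost model, and every constant in it (100 MB/s per storage node, 5 ms per chunk, the data volume, and the compute hours) is an assumption chosen only to make the example run. The sketch shows the general pattern of enumerating candidate configurations, scoring each with a cheap predictor instead of running the application, and keeping the configuration with the lowest estimate.

```python
from itertools import product


def predict_runtime(total_nodes, storage_nodes, replication, chunk_mb,
                    data_gb=100.0, compute_hours=50.0):
    """Hypothetical toy predictor (NOT the paper's model): estimated seconds."""
    app_nodes = total_nodes - storage_nodes
    # Compute work is split evenly across the application nodes.
    compute_s = compute_hours * 3600.0 / app_nodes
    # Assumed 100 MB/s per storage node; replication multiplies bytes written.
    io_s = data_gb * 1024.0 * replication / (storage_nodes * 100.0)
    # Assumed fixed 5 ms overhead per chunk, spread across storage nodes.
    chunks = data_gb * 1024.0 / chunk_mb
    overhead_s = chunks * 0.005 / storage_nodes
    return compute_s + io_s + overhead_s


def explore(total_nodes, replications, chunk_sizes_mb):
    """Score every candidate configuration and return (estimate, config)."""
    candidates = []
    for storage_nodes, replication, chunk_mb in product(
            range(1, total_nodes), replications, chunk_sizes_mb):
        if replication > storage_nodes:
            continue  # cannot place more replicas than storage nodes
        estimate = predict_runtime(total_nodes, storage_nodes,
                                   replication, chunk_mb)
        candidates.append((estimate, {
            "storage_nodes": storage_nodes,
            "app_nodes": total_nodes - storage_nodes,
            "replication": replication,
            "chunk_mb": chunk_mb,
        }))
    return min(candidates, key=lambda c: c[0])


if __name__ == "__main__":
    best_time, best_cfg = explore(20, [1, 2, 3], [1, 4, 16])
    print(f"estimated runtime: {best_time:.0f} s with {best_cfg}")
```

In this toy model, replication and small chunks only add cost, so the search always prefers the lowest replication level and the largest chunk size. A realistic predictor would also capture the benefits of those settings, such as parallel reads from replicas, and it is those trade-offs that make the configuration space worth exploring.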

Published In

ICS '14: Proceedings of the 28th ACM international conference on Supercomputing
June 2014
378 pages
ISBN:9781450326421
DOI:10.1145/2597652

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. distributed storage systems
  2. performance prediction

Qualifiers

  • Research-article

Conference

ICS'14

Acceptance Rates

ICS '14 Paper Acceptance Rate 34 of 160 submissions, 21%
Overall Acceptance Rate 629 of 2,180 submissions, 29%

Cited By

  • (2018) A Checkpoint of Research on Parallel I/O for High-Performance Computing. ACM Computing Surveys 51:2 (1-35). DOI:10.1145/3152891. Online publication date: 12-Mar-2018
  • (2017) Runtime Performance Prediction of Big Data Workflows with I/O-aware Simulation. Proceedings of the 11th EAI International Conference on Performance Evaluation Methodologies and Tools (74-81). DOI:10.1145/3150928.3150943. Online publication date: 5-Dec-2017
  • (2017) Toward Managing HPC Burst Buffers Effectively: Draining Strategy to Regulate Bursty I/O Behavior. 2017 IEEE 25th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) (87-98). DOI:10.1109/MASCOTS.2017.35. Online publication date: Sep-2017
  • (2017) A cross-layer optimized storage system for workflow applications. Future Generation Computer Systems 75 (423-437). DOI:10.1016/j.future.2017.02.038. Online publication date: Oct-2017
  • (2016) Dynamic Process Migration Based on Block Access Patterns Occurring in Storage Servers. ACM Transactions on Architecture and Code Optimization 13:2 (1-20). DOI:10.1145/2899002. Online publication date: 14-Jun-2016
  • (2014) Experience with using a performance predictor during development. Proceedings of the 2nd International workshop on Software Engineering for High Performance Computing in Computational Science and Engineering (13-19). DOI:10.1109/SE-HPCCSE.2014.6. Online publication date: 16-Nov-2014
