Semantic-based Distributed I/O with the ParaMEDIC Framework
Source
High Performance Distributed Computing
Proceedings of the 17th International Symposium on High Performance Distributed Computing
Boston, MA, USA
SESSION: Storage and I/O
Pages 175-184
Year of Publication: 2008
ISBN: 978-1-59593-997-5
Authors
Pavan Balaji  Argonne National Laboratory, Argonne, IL, USA
Wuchun Feng  Virginia Tech, Blacksburg, VA, USA
Heshan Lin  North Carolina State University, Raleigh, NC, USA
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1383422.1383444

ABSTRACT

Many large-scale applications rely on multiple resources simultaneously for efficient execution. For example, such applications may require both large compute and large storage resources; however, very few supercomputing centers can provide large quantities of both. As a result, data generated at the compute site often has to be moved to a remote site for storage or for visualization and analysis. This model is inefficient, especially when the two sites are separated by a wide-area network.

We therefore present a framework called "ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing", which uses application-specific semantic information to convert the generated data at the compute site into metadata that is orders of magnitude smaller, transfer the metadata to the storage site, and re-process the metadata at the storage site to regenerate the output. Specifically, ParaMEDIC trades a small amount of additional computation (in the form of data post-processing) for a potentially significant reduction in the data that must be transferred in distributed environments.
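
The convert, transfer, and regenerate steps described in the abstract can be illustrated with a toy sketch. This is not the ParaMEDIC implementation: the in-memory database, the substring "search", and the choice of record indices as metadata are all assumptions made for illustration, and every function name is hypothetical. The sketch also assumes that both sites hold an identical copy of the input database.

```python
from typing import List, Tuple

# Hypothetical input data set, assumed to be available at BOTH sites.
DATABASE: List[str] = [
    f"record-{i:04d}:" + "ACGT" * (i % 7 + 1) for i in range(1000)
]


def format_output(query: str, indices: List[int]) -> List[str]:
    """Build the bulky, human-readable output for the given record indices."""
    return [f"{query} matches #{i}: {DATABASE[i]}" for i in indices]


def compute_site(query: str) -> Tuple[List[str], List[int]]:
    """Run the full search over every record.

    Returns the full output and the compact metadata (matching indices).
    Only the metadata would be sent over the wide-area network.
    """
    hits = [i for i, record in enumerate(DATABASE) if query in record]
    return format_output(query, hits), hits


def storage_site(query: str, metadata: List[int]) -> List[str]:
    """Regenerate the output by re-processing only the records named in
    the metadata, instead of repeating the full search."""
    return format_output(query, metadata)


def size_in_bytes(lines: List[str]) -> int:
    """Size of the data if it were shipped as newline-separated text."""
    return len("\n".join(lines).encode("utf-8"))
```

Calling `compute_site(query)` and then passing the returned indices to `storage_site(query, indices)` reproduces the original output, while `size_in_bytes` applied to the indices (as strings) and to the full output shows how much less data crosses the network. The extra formatting work at the storage site is the additional computation traded for the smaller transfer.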


