research-article

Random Slicing: Efficient and Scalable Data Placement for Large-Scale Storage Systems

Authors:
Alberto Miranda

Barcelona Supercomputing Center, Barcelona, Spain

Barcelona Supercomputing Center, Barcelona, Spain
View Profile

,
Sascha Effert

Christmann Informationstechnik + Medien, Ilsede, Germany

Christmann Informationstechnik + Medien, Ilsede, Germany
View Profile

,
Yangwook Kang

University of California, Santa Cruz, CA

University of California, Santa Cruz, CA
View Profile

,
Ethan L. Miller

University of California, Santa Cruz, CA

University of California, Santa Cruz, CA
View Profile

,
Ivan Popov

Johannes Gutenberg-University, Mainz, Germany

Johannes Gutenberg-University, Mainz, Germany
View Profile

,
Andre Brinkmann

Johannes Gutenberg-University, Mainz, Germany

Johannes Gutenberg-University, Mainz, Germany
View Profile

,
Tom Friedetzky

Durham University, Durham, England, U.K.

Durham University, Durham, England, U.K.
View Profile

,
Toni Cortes

Barcelona Supercomputing Center and Technical University of Catalonia, Barcelona, Spain

Barcelona Supercomputing Center and Technical University of Catalonia, Barcelona, Spain
View Profile

Authors Info & Claims

ACM Transactions on Storage Volume 10 Issue 3Article No.: 9pp 1–35https://doi.org/10.1145/2632230

Published:07 August 2014Publication History

ACM Transactions on Storage

Abstract

The ever-growing amount of data requires highly scalable storage solutions. The most flexible approach is to use storage pools that can be expanded and scaled down by adding or removing storage devices. To make this approach usable, it is necessary to provide a solution to locate data items in such a dynamic environment. This article presents and evaluates the Random Slicing strategy, which incorporates lessons learned from table-based, rule-based, and pseudo-randomized hashing strategies and is able to provide a simple and efficient strategy that scales up to handle exascale data. Random Slicing keeps a small table with information about previous storage system insert and remove operations, drastically reducing the required amount of randomness while delivering a perfect load distribution.

References

S. Amarasinghe, D. Campbell, W. Carlson, A. Chien, W. Dally, E. Elnohazy, M. Hall, et al. 2010. ExaScale software study: Software challenges in extreme scale systems. Tech. Rep., sponsored by DARPA IPTO in the context of the ExaScale Computing Study.Google Scholar
A. Azagury, V. Dreizin, M. Factor, E. Henis, D. Naor, Y. Rinetzky, O. Rodeh, J. Satran, A. Tavory, and L. Yerushalmi. 2003. Towards an object store. In Proceedings of the 20th IEEE Conference on Mass Storage Systems and Technologies (MSST). 165--176. Google ScholarDigital Library
Y. Azar, A. Broder, A. Karlin, and E. Upfal. 1999. Balanced allocations. SIAM J. Comput. 29, 1, 180--200. Google ScholarDigital Library
J. L. Bentley. 1977. Solutions to Klees rectangle problems. Tech. Rep. Carnegie-Mellon University, Pittsburgh, PA.Google Scholar
M. Blaum, J. Brady, J. Bruck, and J. Menon. 1994. EVENODD: An optimal scheme for tolerating double disk failures in RAID architectures. In Proceedings of the 21st International Symposium on Computer Architecture (ISCA). 245--254. Google ScholarDigital Library
R. P. Brent. 1992. Uniform random number generators for supercomputers. In Proceedings of the 5th Australian Supercomputer Conference. 95--104.Google Scholar
A. Brinkmann and S. Effert. 2008. Redundant data placement strategies for cluster storage environments. In Proceedings of the 12th International Conference on Principles of DIstributed Systems (OPODIS). Google ScholarDigital Library
A. Brinkmann, S. Effert, F. Meyer Auf Der Heide, and C. Scheideler. 2007. Dynamic and redundant data placement. In Proceedings of the 27th IEEE International Conference on Distributed Computing Systems (ICDCS). Google ScholarDigital Library
A. Brinkmann, M. Heidebuer, F. Meyer Auf Der Heide, U. Rückert, K. Salzwedel, and M. Vodisek. 2004. V: Drive - Costs and Benefits of an Out-of-Band Storage Virtualization System. In Proceedings of the 21st IEEE Conference on Mass Storage Systems and Technologies (MSST). 153--157.Google Scholar
A. Brinkmann, K. Salzwedel, and C. Scheideler. 2000. Efficient, distributed data placement strategies for storage area networks. In Proceedings of the 12th ACM Symposium on Parallel Algorithms and Architectures (SPAA). 119--128. Google ScholarDigital Library
A. Brinkmann, Kay Salzwedel, and C. Scheideler. 2002. Compact, adaptive placement schemes for non-uniform distribution requirements. In Proceedings of the 14th ACM Symposium on Parallel Algorithms and Architectures (SPAA). 53--62. Google ScholarDigital Library
W. S. Chou. 1995. On inversive maximal period polynomials over finite fields. Appl. Algeb. Eng. Commun. Comput. 6, 4--5, 245--250.Google Scholar
P. Corbett, B. English, A. Goel, T. Grcanac, S. Kleiman, J. Leong, and S. Sankar. 2004. Row-diagonal parity for double disk failure correction. In Proceedings of the 3rd USENIX Conference on File and Storage Technologies (FAST). 1--14. Google ScholarDigital Library
T. Cortes and J. Labarta. 2001. Extending heterogeneity to RAID level 5. In Proceedings of the USENIX Annual Technical Conference. 119--132. Google ScholarDigital Library
M. De Berg, O. Cheong, and M. Van Kreveld. 2008. Computational Geometry: Algorithms and Applications. Springer. Google ScholarDigital Library
G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. 2007. Dynamo: Amazon's highly available key-value store. ACM SIGOPS Oper. Syst. Rev. 41, 6, 205--220. Google ScholarDigital Library
A. Devulapalli, D. Dalessandro, and P. Wyckoff. 2008. Data structure consistency using atomic operations in storage devices. In Proceedings of the 5th International Workshop on Storage Network Architecture and Parallel I/Os (SNAPI). 65--73. Google ScholarDigital Library
D. Eastlake and P. Jones. 2001. US secure hash algorithm 1 (SHA1).Google Scholar
J. Gonzalez and T. Cortes. 2008. Distributing orthogonal redundancy on adaptive disk arrays. In Proceedings of the International Conference on Grid Computing, High-Performance and Distributed Applications (GADA). Google ScholarDigital Library
R. J. Honicky and E. L. Miller. 2003. A fast algorithm for online placement and reorganization of replicated data. In Proceedings of the 17th IEEE International Parallel and Distributed Processing Symposium (IPDPS). Google ScholarDigital Library
R. J. Honicky and E. L. Miller. 2004. Replication under scalable hashing: A family of algorithms for scalable decentralized data distribution. In Proceedings of the 18th IEEE International Parallel and Distributed Processing Symposium (IPDPS).Google Scholar
N. L. Johnson and S. Kotz. 1977. Urn Models and Their Applications. Wiley, New York.Google Scholar
D. Karger, E. Lehman, T. Leighton, M. Levine, D. Lewin, and R. Panigrahy. 1997. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the world wide web. In Proceedings of the 29th ACM Symposium on Theory of Computing (STOC). 654--663. Google ScholarDigital Library
D. E. Knuth. 1997. Volume 2: Seminumerical Algorithms. In the Art of Computer Programming, 192.Google Scholar
D. H. Lehmer. 1951. Mathematical methods in large-scale computing units. Ann. Comput. Lab. Harvard Univ. 26, 141--146.Google Scholar
M. Luby. 1996. Pseudorandomness and Cryptographic Applications. Princeton University Press. Google ScholarDigital Library
G. Marsaglia and A. Zaman. 1991. A new class of random number generators. Ann. Appl. Probab. 462--480.Google Scholar
M. Mense and C. Scheideler. 2008. SPREAD: An adaptive scheme for redundant and fair storage in dynamic heterogeneous storage systems. In Proceedings of the 19th ACM-SIAM Symposium on Discrete Algorithms (SODA). Google ScholarDigital Library
M. Mitzenmacher. 1996. The power of two choices in randomized load balancing. Ph.D. thesis. Computer Science Department, University of California, Berkeley. Google ScholarDigital Library
D. A. Patterson, G. Gibson, and R. H. Katz. 1988. A case for redundant arrays of inexpensive disks (RAID). In Proceedings of the ACM Conference on Management of Data (SIGMOD). 109--116. Google ScholarDigital Library
I. Popov, A. Brinkmann, and T. Friedetzky. 2012. On the influence of PRNGs on data distribution. In Proceedings of the 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). IEEE, 536--543. Google ScholarDigital Library
M. Raab and A. Steger. 1998. Balls into BINSA simple and tight analysis. Random. Approx. Tech. Comput. Sci., 159--170. Google ScholarDigital Library
P. Sanders. 2001. Reconciling simplicity and realism in parallel disk models. In Proceedings of the 12th ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM, 67--76. Google ScholarDigital Library
C. Schindelhauer and G. Schomaker. 2005. Weighted distributed hash tables. In Proceedings of the 17th ACM Symposium on Parallel Algorithms and Architectures (SPAA). 218--227. Google ScholarDigital Library
F. L. Severence. 2009. System Modeling and Simulation: An Introduction. Wiley.Google Scholar
M. Stevens, A. Lenstra, and B. de Weger. 2007. Chosen-prefix collisions for md5 and colliding x.509 certificates for different identities. In Proceedings of the 26th Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT). 1--22. Google ScholarDigital Library
M. Stevens, A. Sotirov, J. Appelbaum, A. Lenstra, D. Molnar, D. Osvik, and B. De Weger. 2009. Short Chosen-Prefix Collisions for MD5 and the Creation of a rogue CA certificate. In Proceedings of the 29th Annual International Cryptology Conference (CRYPTO). 55--69. Google ScholarDigital Library
I. Stoica, R. Morris, D. Liben-Nowell, D. Karger, M. Kaashoek, F. Dabek, and H. Balakrishnan. 2003. Chord: A scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Trans. Netw. 11, 1, 17--32. Google ScholarDigital Library
J. Viega. 2003. Practical random number generation in software. In Proceedings of the 19th Annual Computer Security Applications Conference (ACSAC). 129--141. Google ScholarDigital Library
J. Walker. 1998. ENT Test suite. http://www.fourmilab.ch/random.Google Scholar
T. Wang. 2007. Integer hash function. http://www.concentric.net/ttwang/tech/inthash.htm.Google Scholar
X. Wang, Y. Yin, and H. Yu. 2005. Finding Collisions in the Full SHA-1. In Proceedings of the 25th Annual International Cryptology Conference (CRYPTO). 17--36. Google ScholarDigital Library
S. A. Weil, S. A. Brandt, E. L. Miller, D. D. E. Long, and C. Maltzahn. 2006a. Ceph: A scalable, high-performance distributed file system. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI). 307--320. Google ScholarDigital Library
S. A. Weil, S. A. Brandt, E. L. Miller, and C. Maltzahn. 2006b. CRUSH: Controlled, scalable and decentralized placement of replicated data. In Proceedings of the ACM/IEEE Conference on Supercomputing. Google ScholarDigital Library
W. Zheng and G. Zhang. 2011. FastScale: Accelerate RAID Scaling by minimizing data migration. In Proceedings of the 9th USENIX Conference on File and Storage Technologies (FAST). Google ScholarDigital Library

Index Terms

Random Slicing: Efficient and Scalable Data Placement for Large-Scale Storage Systems
1. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management
        Allocation / deallocation strategies

Recommendations

Opportunities and challenges of storage grid enabled by grid service

The explosive growth of data generated worldwide by information digitization has been identified as the key driver of storage system evolution. In this paper, we discuss the existing storage system architectures and pose the challenges they are facing. ...
Read More
The Tiger Shark file system
COMPCON '96: Proceedings of the 41st IEEE International Computer Conference

Tiger Shark is a parallel file system for IBM's AIX operating system. It is designed to support interactive multimedia, particularly large-scale systems such as interactive television (ITV). Tiger Shark scales across the entire RS/6000 product line, ...
Read More
Efficient detection of large-scale redundancy in enterprise file systems

In order to catch and reduce waste in the exponentially increasing demand for disk storage, we have developed very efficient technology to detect approximate duplication of large directory hierarchies. Such duplication can be caused, for example, by ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Storage Volume 10, Issue 3
July 2014
113 pages
ISSN:1553-3077
EISSN:1553-3093
DOI:10.1145/2661087
Editor:
Darrell Long
University of California Santa Cruz, USA
Issue’s Table of Contents
Copyright © 2014 ACM
© 2014 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 7 August 2014
- Accepted: 1 September 2013
- Revised: 1 June 2013
- Received: 1 November 2012
Published in tos Volume 10, Issue 3

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
PRNG
randomized data distribution
scalability
storage management
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 9
  Total Citations
  View Citations
- 552
  Total Downloads
- Downloads (Last 12 months)15
- Downloads (Last 6 weeks)2
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Random Slicing: Efficient and Scalable Data Placement for Large-Scale Storage Systems

ACM Transactions on Storage

Abstract

References

Cited By

Index Terms

Recommendations

Opportunities and challenges of storage grid enabled by grid service

The Tiger Shark file system

Efficient detection of large-scale redundancy in enterprise file systems