Article

Analysis of a new intra-disk redundancy scheme for high-reliability RAID storage systems in the presence of unrecoverable errors

Authors:

Ajay Dholakia,

Evangelos Eleftheriou,

Xiao-Yu Hu,

Ilias Iliadis,

Jai Menon,

KK Rao

SIGMETRICS '06/Performance '06: Proceedings of the joint international conference on Measurement and modeling of computer systems

Pages 373 - 374

https://doi.org/10.1145/1140277.1140326

Published: 26 June 2006

Abstract

Today's data storage systems are increasingly adopting low-cost disk drives that have higher capacity but lower reliability, leading to more frequent rebuilds and to a higher risk of unrecoverable media errors. We propose a new XOR-based intra-disk redundancy scheme, called interleaved parity check (IPC), that enhances the reliability of RAID systems while incurring only negligible I/O performance degradation. The proposed scheme introduces an additional level of redundancy inside each disk, on top of the RAID redundancy across multiple disks. The RAID parity provides protection against disk failures, while the proposed scheme aims to protect against media-related unrecoverable errors.

We develop a new model capturing the effect of correlated unrecoverable sector errors and subsequently use it to analyze the proposed scheme as well as the traditional redundancy schemes based on Reed-Solomon (RS) codes and single-parity-check (SPC) codes. We derive closed-form expressions for the mean time to data loss (MTTDL) of RAID 5 and RAID 6 systems in the presence of unrecoverable errors and disk failures. We then combine these results for a comprehensive characterization of the reliability of RAID systems that incorporate the proposed IPC redundancy scheme. Our results show that in the practical case of correlated errors, the proposed scheme provides the same reliability as the optimum, albeit more complex, RS coding scheme. Finally, the throughput performance of incorporating the intra-disk redundancy on various RAID systems is evaluated by means of event-driven simulations. A detailed description of these contributions is given in [1].
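The abstract does not spell out the IPC layout, so the following is only a minimal sketch of how an XOR-based interleaved parity could work inside one disk segment. It assumes that data sector i belongs to interleave i mod m and that each interleave has one parity sector; the function names, the parameter m, and the use of None to mark an unreadable sector are all hypothetical and are not taken from the paper.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def ipc_parity(data_sectors, m):
    """Compute m parity sectors for one segment.

    Assumed layout (not from the paper): data sector i belongs to
    interleave i % m, and parity j is the XOR of interleave j.
    """
    size = len(data_sectors[0])
    parity = [bytes(size) for _ in range(m)]
    for i, sector in enumerate(data_sectors):
        parity[i % m] = xor_bytes(parity[i % m], sector)
    return parity


def ipc_recover(data_sectors, parity, m):
    """Rebuild sectors marked None, assuming the parity sectors are readable.

    Succeeds when each interleave has lost at most one sector, which
    covers any burst of up to m consecutive unreadable data sectors.
    """
    repaired = list(data_sectors)
    for j in range(m):
        members = range(j, len(repaired), m)
        lost = [i for i in members if repaired[i] is None]
        if len(lost) > 1:
            raise ValueError(f"interleave {j} lost {len(lost)} sectors")
        if lost:
            acc = parity[j]
            for i in members:
                if repaired[i] is not None:
                    acc = xor_bytes(acc, repaired[i])
            repaired[lost[0]] = acc
    return repaired
```

With m = 4, for example, a run of four consecutive unreadable data sectors touches each interleave exactly once, so ipc_recover can rebuild all four. Two lost sectors in the same interleave cannot be rebuilt by this sketch and would have to be handled by the RAID parity across disks.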

Reference

[1]

A. Dholakia et al. Analysis of a New Intra-Disk Redundancy Scheme for High-Reliability RAID Storage Systems in the Presence of Unrecoverable Errors. IBM Research Report RZ 3652, March 16, 2006.

Cited By

Schwarz T, Breitgand D, Yadgar G, Porter D, Eyal I (2018). Protecting Single Shingled Write Drives Against Latent Sector Failures. Proceedings of the 11th ACM International Systems and Storage Conference (26-36). DOI: 10.1145/3211890.3211893. Online publication date: 4-Jun-2018.
https://dl.acm.org/doi/10.1145/3211890.3211893
Iliadis I, Venkatesan V (2015). Rebuttal to “Beyond MTTDL: A Closed-Form RAID-6 Reliability Equation”. ACM Transactions on Storage 11:2 (1-10). DOI: 10.1145/2700311. Online publication date: 20-Mar-2015.
https://dl.acm.org/doi/10.1145/2700311
Wildani A, Schwarz T, Miller E, Long D (2009). Protecting against rare event failures in archival systems. 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems (1-11). DOI: 10.1109/MASCOT.2009.5366825. Online publication date: Sep-2009.
https://doi.org/10.1109/MASCOT.2009.5366825

Index Terms

Analysis of a new intra-disk redundancy scheme for high-reliability RAID storage systems in the presence of unrecoverable errors
1. Hardware
  1. Communication hardware, interfaces and storage
    1. External storage
  2. Integrated circuits
    1. Semiconductor memory
      1. Non-volatile memory

Information & Contributors

Information

Published In

SIGMETRICS '06/Performance '06: Proceedings of the joint international conference on Measurement and modeling of computer systems

June 2006

404 pages

ISBN:1595933190

DOI:10.1145/1140277

General Chair: Raymond Marie (University of Rennes 1/IRISA, France)
Program Chairs: Peter Key (Microsoft Research, Cambridge, U.K.), Evgenia Smirni (College of William and Mary, USA)

ACM SIGMETRICS Performance Evaluation Review Volume 34, Issue 1
Performance evaluation review
June 2006
388 pages
ISSN:0163-5999
DOI:10.1145/1140103

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

SIGMETRICS06

SIGMETRICS06: 2006 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems

June 26 - 30, 2006

Saint Malo, France

Acceptance Rates

Overall Acceptance Rate 459 of 2,691 submissions, 17%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 24
Total Downloads: 657
Downloads (Last 12 months): 8
Downloads (Last 6 weeks): 0

Reflects downloads up to 20 Feb 2025

Citations

Cited By

Iliadis I (2009). Reliability modeling of RAID storage systems with latent errors. 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems (1-12). DOI: 10.1109/MASCOT.2009.5366195. Online publication date: Sep-2009.
https://doi.org/10.1109/MASCOT.2009.5366195
Storer M, Greenan K, Miller E, Voruganti K (2008). Pergamum. Proceedings of the 6th USENIX Conference on File and Storage Technologies (1-16). DOI: 10.5555/1364813.1364814. Online publication date: 26-Feb-2008.
https://dl.acm.org/doi/10.5555/1364813.1364814
Ningfang Mi, Riska A, Smirni E, Riedel E (2008). Enhancing data availability in disk drives through background activities. 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN) (492-501). DOI: 10.1109/DSN.2008.4630120. Online publication date: Jun-2008.
https://doi.org/10.1109/DSN.2008.4630120
Yu Z, Wang Z, He H, Tian J, Lu X, Guo B (2015). Discovering Information Propagation Patterns in Microblogging Services. ACM Transactions on Knowledge Discovery from Data 10:1 (1-22). DOI: 10.1145/2742801. Online publication date: 22-Jul-2015.
https://dl.acm.org/doi/10.1145/2742801
Johari S, Kumar A, Gupta P, Malekian R (2015). dDRAID: A technique for capacity and performance enhancement of RAID storage systems. 2015 Annual IEEE India Conference (INDICON) (1-6). DOI: 10.1109/INDICON.2015.7443331. Online publication date: Dec-2015.
https://doi.org/10.1109/INDICON.2015.7443331
