Using device diversity to protect data against batch-correlated disk failures
Source: Conference on Computer and Communications Security
Proceedings of the second ACM workshop on Storage security and survivability
Alexandria, Virginia, USA
SESSION: Protection and trust
Pages: 47 - 52  
Year of Publication: 2006
ISBN:1-59593-552-5
Authors
Jehan-François Pâris  University of Houston
Darrell D. E. Long  University of California
Sponsors
SIGSAC: ACM Special Interest Group on Security, Audit, and Control
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1179559.1179568

ABSTRACT

Batch-correlated failures result from the manifestation of a common defect in most, if not all, disk drives belonging to the same production batch. They are much less frequent than random disk failures but can cause catastrophic data losses even in systems that rely on mirroring or erasure codes to protect their data. We propose to reduce the impact of batch-correlated failures on disk arrays by storing redundant copies of the same data on disks from different batches and, possibly, different manufacturers. The technique is especially attractive for mirrored organizations, as it only requires that the two disks holding copies of the same data never belong to the same production batch. We also show that even partial diversity can greatly increase the probability that the data stored in a RAID array will survive batch-correlated failures.
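
The mirroring constraint stated in the abstract (the two disks of a mirrored pair never share a production batch) can be illustrated with a minimal sketch. The abstract does not describe a placement algorithm, so everything below is an assumption made for illustration: the function name `pair_across_batches`, the `(disk_id, batch_id)` input format, and the greedy strategy of always pairing disks from the two batches with the most unassigned disks.

```python
import heapq
from collections import defaultdict


def pair_across_batches(disks):
    """Form mirrored pairs whose two disks come from different batches.

    disks: iterable of (disk_id, batch_id) tuples (hypothetical format).
    Returns (pairs, leftover), where leftover lists the disks that could
    not be paired without violating the different-batch constraint.
    """
    by_batch = defaultdict(list)
    for disk_id, batch_id in disks:
        by_batch[batch_id].append(disk_id)

    # Max-heap on the number of unassigned disks per batch
    # (counts are negated because heapq implements a min-heap).
    heap = [(-len(ids), batch) for batch, ids in by_batch.items()]
    heapq.heapify(heap)

    pairs = []
    while len(heap) >= 2:
        n1, b1 = heapq.heappop(heap)
        n2, b2 = heapq.heappop(heap)
        # b1 and b2 are distinct batches, so the pair is diverse.
        pairs.append((by_batch[b1].pop(), by_batch[b2].pop()))
        if n1 + 1 < 0:
            heapq.heappush(heap, (n1 + 1, b1))
        if n2 + 1 < 0:
            heapq.heappush(heap, (n2 + 1, b2))

    leftover = [d for ids in by_batch.values() for d in ids]
    return pairs, leftover


if __name__ == "__main__":
    inventory = [("d0", "A"), ("d1", "A"), ("d2", "B"), ("d3", "B")]
    pairs, leftover = pair_across_batches(inventory)
    print(pairs, leftover)
```

Under these assumptions, every disk can be paired only when the total number of disks is even and no single batch supplies more than half of them; otherwise some disks end up in `leftover`, which corresponds to a situation where only partial diversity is achievable.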


