Modeling the relative fitness of storage

Published: 12 June 2007

Abstract

Relative fitness is a new black-box approach to modeling the performance of storage devices. In contrast with an absolute model that predicts the performance of a workload on a given storage device, a relative fitness model predicts performance differences between a pair of devices. There are two primary advantages to this approach. First, because a relative fitness model is constructed for a device pair, the application-device feedback of a closed workload can be captured (e.g., how the I/O arrival rate changes as the workload moves from device A to device B). Second, a relative fitness model allows performance and resource utilization to be used in place of workload characteristics. This is beneficial when workload characteristics are difficult to obtain or concisely express (e.g., rather than describe the spatio-temporal characteristics of a workload, one could use the observed cache behavior of device A to help predict the performance of device B).
This paper describes the steps necessary to build a relative fitness model, with an approach that is general enough to be used with any black-box modeling technique. We compare relative fitness models and absolute models across a variety of workloads and storage devices. On average, relative fitness models predict bandwidth and throughput within 10-20% and can reduce prediction error by as much as a factor of two when compared to absolute models.
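As a rough illustration of the idea in the abstract, the sketch below trains a one-split regression tree (a minimal stand-in for a CART-style black-box learner) to predict how bandwidth changes when a workload moves from device A to device B, using only what was observed on device A. Everything in it is an assumption made for illustration: the function names `fit_stump` and `predict_device_b`, the choice to express relative fitness as the ratio of bandwidth on B to bandwidth on A, the single input feature (cache hit rate on A), and the toy numbers, which are invented and are not measurements from the paper.

```python
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold, one mean per side.

    Picks the threshold that minimizes the summed squared error.
    Requires at least two distinct values in xs.
    """
    best = None
    for t in sorted(set(xs))[1:]:  # every candidate leaves both sides non-empty
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm


# Hypothetical training samples, one per workload run on BOTH devices:
# (cache hit rate observed on A, bandwidth on A in MB/s, bandwidth on B in MB/s)
samples = [
    (0.1, 20.0, 40.0),
    (0.2, 25.0, 50.0),
    (0.8, 60.0, 60.0),
    (0.9, 80.0, 80.0),
]

# Assumed target: relative fitness of B with respect to A, as a bandwidth ratio.
hit_rates = [h for h, _, _ in samples]
ratios = [bw_b / bw_a for _, bw_a, bw_b in samples]
relative_fitness = fit_stump(hit_rates, ratios)


def predict_device_b(hit_rate_a, bandwidth_a):
    """Scale the bandwidth observed on A by the predicted A-to-B ratio."""
    return bandwidth_a * relative_fitness(hit_rate_a)


print(predict_device_b(0.15, 30.0))
print(predict_device_b(0.85, 70.0))
```

An absolute model would instead take workload characteristics as input and predict the bandwidth of one device directly. Here the inputs are the behavior observed on device A, so no explicit workload description is needed, which is the second advantage the abstract names.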

Published In

ACM SIGMETRICS Performance Evaluation Review  Volume 35, Issue 1
SIGMETRICS '07 Conference Proceedings
June 2007
382 pages
ISSN:0163-5999
DOI:10.1145/1269899

SIGMETRICS '07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
June 2007
398 pages
ISBN:9781595936394
DOI:10.1145/1254882
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CART
  2. black-box
  3. modeling
  4. storage

Qualifiers

  • Article
