Intelligent storage: Cross-layer optimization for soft real-time workload

Published: 01 August 2006

Abstract

In this work, we develop an intelligent storage system framework for soft real-time applications. Modern software systems consist of a collection of layers, and information is exchanged across the layers via well-defined interfaces. Because these interface definitions are strict and inflexible, information specific to one layer cannot be passed to other layers. In practice, exploiting this information across layers can greatly enhance the performance, reliability, and manageability of the system. We address this limitation of legacy interface definitions by enabling intelligence in the storage system. The objective is to enable a lower-layer entity, for example, a physical or block device, to conjecture the semantic and contextual information about application behavior that cannot be passed via the legacy interface. Based on the knowledge obtained by the intelligence module, the system can perform a number of actions to improve its performance, reliability, security, and manageability. Our intelligent storage system focuses on optimizing I/O subsystem performance for soft real-time applications. Our intelligence framework consists of three components: the workload monitor, the workload analyzer, and the system optimizer. The workload monitor maintains a window of recent I/O requests and extracts feature vectors at regular intervals. The workload analyzer is trained to determine the class of the incoming workload from the feature vector. The system optimizer performs various actions to tune the storage system for a given workload. We use confidence-rated boosting to train the workload analyzer. This learner achieves higher than 97% accuracy in workload class prediction. We develop a prototype intelligent storage system on a legacy operating system platform. The system optimizer performs (1) dynamic adjustment of the file-system-level read-ahead size; (2) dynamic adjustment of the I/O request size; and (3) filtering of I/O requests. We examine the effect of this autonomic optimization experimentally. We find that storage-level proactive optimization greatly enhances the efficiency of the underlying storage system. The intelligence module developed in this work is not restricted to performance optimization. It can be used effectively as a classification engine for generic autonomic computing environments, for example, management, diagnosis, and security.
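
To make the workload monitor's role concrete, the following is a minimal sketch of a sliding window over recent I/O requests that yields a feature vector on demand. It is an illustration only, not the implementation evaluated in the article: the `IORequest` fields, the window size, and the three features (mean request size, read fraction, sequential fraction) are all assumptions, since the abstract does not say which features are extracted.

```python
from collections import deque
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IORequest:
    # All fields are hypothetical; the abstract does not define a request format.
    lba: int        # starting logical block address
    blocks: int     # request length in blocks
    is_read: bool   # True for a read, False for a write


class WorkloadMonitor:
    """Keeps the most recent I/O requests and summarizes them on demand."""

    def __init__(self, window_size: int = 128) -> None:
        # A deque with maxlen silently drops the oldest request when full.
        self.window = deque(maxlen=window_size)

    def observe(self, req: IORequest) -> None:
        self.window.append(req)

    def feature_vector(self) -> List[float]:
        """Return [mean request size, read fraction, sequential fraction]."""
        reqs = list(self.window)
        if not reqs:
            return [0.0, 0.0, 0.0]
        mean_size = sum(r.blocks for r in reqs) / len(reqs)
        read_frac = sum(1 for r in reqs if r.is_read) / len(reqs)
        # A request counts as sequential if it starts where the previous one ended.
        seq = sum(
            1 for prev, cur in zip(reqs, reqs[1:])
            if cur.lba == prev.lba + prev.blocks
        )
        seq_frac = seq / (len(reqs) - 1) if len(reqs) > 1 else 0.0
        return [mean_size, read_frac, seq_frac]


# Usage: four back-to-back 8-block reads, as a streaming reader might issue.
monitor = WorkloadMonitor(window_size=4)
for i in range(4):
    monitor.observe(IORequest(lba=8 * i, blocks=8, is_read=True))
features = monitor.feature_vector()
```

In a design like this, a periodic timer would call `feature_vector()` and hand the result to the trained workload analyzer, whose predicted class would in turn drive the optimizer's choice of read-ahead size, request size, or filtering.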

Published In

ACM Transactions on Storage  Volume 2, Issue 3
August 2006
149 pages
ISSN:1553-3077
EISSN:1553-3093
DOI:10.1145/1168910
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Intelligence
  2. autonomic computing
  3. boosting
  4. cross layer optimization
  5. file system
  6. machine learning
  7. multimedia
  8. storage

Qualifiers

  • Article
