research-article

Rethink the sync

Published: 22 September 2008

Abstract

We introduce external synchrony, a new model for local file I/O that provides the reliability and simplicity of synchronous I/O, yet also closely approximates the performance of asynchronous I/O. An external observer cannot distinguish the output of a computer with an externally synchronous file system from the output of a computer with a synchronous file system. No application modification is required to use an externally synchronous file system. In fact, application developers can program to the simpler synchronous I/O abstraction and still receive excellent performance. We have implemented an externally synchronous file system for Linux, called xsyncfs. Xsyncfs provides the same durability and ordering guarantees as those provided by a synchronously mounted ext3 file system. Yet even for I/O-intensive benchmarks, xsyncfs performance is within 7% of ext3 mounted asynchronously. Compared to ext3 mounted synchronously, xsyncfs is up to two orders of magnitude faster.
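The abstract states the guarantee but not the mechanism behind it. As a rough illustration only, the sketch below assumes one way such a guarantee could be met at user level: file writes return immediately, while any output an external observer could see is held back until the writes it follows have been forced to disk. The class `ToyExternalSync`, its methods, and `demo` are hypothetical names invented for this example; this is not the xsyncfs implementation, which is a Linux file system.

```python
import os
import tempfile


class ToyExternalSync:
    """Toy model (an assumption, not xsyncfs): writes return at once, but
    externally visible output waits until earlier writes are durable."""

    def __init__(self, path):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self._dirty = False   # True while some write has not been fsync'ed
        self._held = []       # output waiting on pending writes
        self.visible = []     # what the external observer has seen so far

    def write(self, data: bytes) -> None:
        # Hand the data to the OS without forcing it to disk.
        os.write(self._fd, data)
        self._dirty = True

    def emit(self, message: str) -> None:
        # Release output immediately only if no write is still pending.
        if self._dirty:
            self._held.append(message)
        else:
            self.visible.append(message)

    def commit(self) -> None:
        # Force pending writes to disk, then release held output in order.
        os.fsync(self._fd)
        self._dirty = False
        self.visible.extend(self._held)
        self._held.clear()

    def close(self) -> None:
        self.commit()
        os.close(self._fd)


def demo():
    """Return what the observer saw before and after the commit."""
    with tempfile.TemporaryDirectory() as tmp:
        fs = ToyExternalSync(os.path.join(tmp, "log"))
        fs.write(b"record 1\n")        # returns without waiting for the disk
        fs.emit("saved record 1")      # held: the write is not yet durable
        before = list(fs.visible)
        fs.commit()
        after = list(fs.visible)
        fs.close()
        return before, after
```

In this toy model the observer never sees "saved record 1" before the record is durable, which matches what a synchronous write followed by the same message would show, even though `write` itself does not block on the disk.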


Information & Contributors

Information

Published In

ACM Transactions on Computer Systems  Volume 26, Issue 3
September 2008
108 pages
ISSN:0734-2071
EISSN:1557-7333
DOI:10.1145/1394441
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 September 2008
Accepted: 01 June 2008
Revised: 01 June 2008
Received: 01 September 2007
Published in TOCS Volume 26, Issue 3

Author Tags

  1. File systems
  2. causality
  3. speculative execution
  4. synchronous I/O

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 134
  • Downloads (Last 6 weeks): 9
Reflects downloads up to 14 Feb 2025

Citations

Cited By

  • (2024) MemSnap μCheckpoints: A Data Single Level Store for Fearless Persistence. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pp. 622-638. DOI: 10.1145/3620666.3651334. Online publication date: 27-Apr-2024
  • (2024) DeLiBA-K: Speeding-up Hardware-Accelerated Distributed Storage Access by Tighter Linux Kernel Integration and Use of a Modern API. Proceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 531-544. DOI: 10.1109/SCW63240.2024.00075. Online publication date: 17-Nov-2024
  • (2023) KalpaVriksh: Efficient and Cost-effective GUI Application Hosting using Singleton Snapshots. 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 180-190. DOI: 10.1109/CCGrid57682.2023.00026. Online publication date: May-2023
  • (2022) L25GC. Proceedings of the ACM SIGCOMM 2022 Conference, pp. 143-157. DOI: 10.1145/3544216.3544267. Online publication date: 22-Aug-2022
  • (2022) Exploiting Nil-external Interfaces for Fast Replicated Storage. ACM Transactions on Storage 18:3, pp. 1-35. DOI: 10.1145/3542821. Online publication date: 2-Sep-2022
  • (2022) EZEE: Epoch Parallel Zero Knowledge for ANSI C. 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), pp. 109-123. DOI: 10.1109/EuroSP53844.2022.00015. Online publication date: Jun-2022
  • (2022) Controlled Lock Violation. On Transactional Concurrency Control, pp. 129-157. DOI: 10.1007/978-3-031-01873-2_5. Online publication date: 26-Feb-2022
  • (2021) The Aurora Single Level Store Operating System. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, pp. 788-803. DOI: 10.1145/3477132.3483563. Online publication date: 26-Oct-2021
  • (2021) Distributed Data Persistency. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 71-85. DOI: 10.1145/3466752.3480060. Online publication date: 18-Oct-2021
  • (2021) Better atomic writes by exposing the flash out-of-band area to file systems. Proceedings of the 22nd ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems, pp. 12-23. DOI: 10.1145/3461648.3463843. Online publication date: 22-Jun-2021
