skip to main content
10.1145/2517349.2522729acmconferencesArticle/Chapter ViewAbstractPublication PagessospConference Proceedingsconference-collections
research-article
Open Access

Transaction chains: achieving serializability with low latency in geo-distributed storage systems

Published:03 November 2013Publication History

ABSTRACT

Currently, users of geo-distributed storage systems face a hard choice between having serializable transactions with high latency, or limited or no transactions with low latency. We show that it is possible to obtain both serializable transactions and low latency, under two conditions. First, transactions are known ahead of time, permitting an a priori static analysis of conflicts. Second, transactions are structured as transaction chains consisting of a sequence of hops, each hop modifying data at one server. To demonstrate this idea, we built Lynx, a geo-distributed storage system that offers transaction chains, secondary indexes, materialized join views, and geo-replication. Lynx uses static analysis to determine if each hop can execute separately while preserving serializability---if so, a client needs wait only for the first hop to complete, which occurs quickly. To evaluate Lynx, we built three applications: an auction service, a Twitter-like microblogging site and a social networking site. These applications successfully use chains to achieve low latency operation and good throughput.

Skip Supplemental Material Section

Supplemental Material

d2-05-yang-zhang.mp4

mp4

1.1 GB

References

  1. http://rubis.ow2.org/index.html as of Oct 2010.Google ScholarGoogle Scholar
  2. Apache cassandra database. http://cassandra.apache.org/.Google ScholarGoogle Scholar
  3. Hbase: Hadoop database. http://hbase.apache.org.Google ScholarGoogle Scholar
  4. MongoDB. http://www.mongodb.com.Google ScholarGoogle Scholar
  5. P. Agrawal, A. Silberstein, B. Cooper, U. Srivastava, and R. Ramakrishnan. Asynchronous view maintenance for VLSD databases. In International Conference on Management of Data, June 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. M. K. Aguilera, A. Merchant, M. Shah, A. Veitch, and C. Karamanolis. Sinfonia: A new paradigm for building scalable distributed systems. ACM Transactions on Computer Systems, 27(3):5:1--5:48, Nov. 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. C. Amza, A. Chanda, A. Cox, S. Elnikety, R. Gil, K. Rajamani, W. Zwaenepoel, E. Cecchet, and J. Marguerite. Specification and implementation of dynamic Web site benchmarks. In IEEE International Workshop on Workload Characterization, Nov. 2002.Google ScholarGoogle ScholarCross RefCross Ref
  8. P. Bailis, A. Fekete, A. Ghodsi, J. M. Hellerstein, and I. Stoica. The potential dangers of causal consistency and an explicit solution. In ACM Symposium on Cloud Computing, Oct. 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. P. Bailis, A. Fekete, A. Ghodsi, J. M. Hellerstein, and I. Stoica. HAT, not CAP: Highly available transactions. In Workshop on Hot Topics in Operating Systems, May 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. J. Baker, C. Bond, J. Corbett, J. Furman, A. Khorlin, J. Larson, J.-M. Leon, Y. Li, A. Lloyd, and V. Yushprakh. Megastore: Providing scalable, highly available storage for interactive services. In Conference on Innovative Database Systems Research, Jan. 2011.Google ScholarGoogle Scholar
  11. A. J. Bernstein, D. S. Gerstl, and P. M. Lewis. Concurrency control for step-decomposed transactions. Information Systems, 24(9):673--698, Dec. 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. P. Bernstein and N. Goodman. Concurrency control in distributed database systems. ACM Computing Survey, 13(2):185--221, June 1981. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. J. Blakeley, P. Larson, and F. Tompa. Efficiently updating materialized views. In International Conference on Management of Data, May 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. W. Bolosky, D. Bradshaw, R. Haagens, N. Kusters, and P. Li. Paxos replicated state machines as the basis of a high-performance data store. In Symposium on Networked Systems Design and Implementation, Mar. 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. H. Boral, W. Alexander, L. Clay, G. Copeland, S. Danforth, M. Franklin, B. Hart, M. Smith, and P. Valduriez. Prototyping Bubba, a highly available parallel database system. Transactions on Knowledge and Data Engineering, 2(1):4--24, Mar. 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Y. Breitbart, H. Garcia-Molina, and A. Silberschatz. Overview of multidatabase transaction management. The VLDB Journal, 1(2):181--239, Oct. 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. M. Cafarella, E. Chang, A. Fikes, A. Halevy, W. Hsieh, A. Lerner, J. Madhavan, and S. Muthukrishnan. Data management projects at Google. In International Conference on Management of Data, June 2008.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for structured data. In Symposium on Operating Systems Design and Implementation, Nov. 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. J. Cipar, G. Ganger, K. Keeton, C. B. Morrey, III, C. A. N. Soules, and A. Veitch. LazyBase: Trading freshness for performance in a scalable database. In European Conference on Computer Systems, Apr. 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. L. Colby, T. Griffin, L. Libkin, I. Mumick, and H. Trickey. Algorithms for deferred view maintenance. In International Conference on Management of Data, June 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D. Weaver, and R. Yerneni. PNUTS: Yahoo!'s hosted data serving platform. 1(2):1277--1288, Aug. 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. J. Corbett et al. Spanner: Google's globally-distributed database. In Symposium on Operating Systems Design and Implementation, Oct. 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. S. Davies. Data processing spheres of control. IBM Systems Journal, 17(2):179--198, June 1978. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. G. DeCandia et al. Dynamo: Amazon's highly available key-value store. In ACM Symposium on Operating Systems Principles, Oct. 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. D. Dewitt, S. Ghandeharizadeh, D. Schneider, A. Bricker, H. i Hsiao, and R. Rasmussen. The Gamma database machine project. Transactions on Knowledge and Data Engineering, 2(1):44--62, Mar. 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. D. DeWitt and J. Gray. Parallel database systems: The future of high performance database systems. Communications of the ACM, 35(6):85--98, June 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. R. Escriva, B. Wong, and E. G. Sirer. HyperDex: A distributed, searchable key-value store for cloud computing. In ACM SIGCOMM Conference, Aug. 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. M. Franklin. Concurrency control and recovery. The Computer Science and Engineering Handbook, pages 1058--1077, 1997.Google ScholarGoogle Scholar
  29. H. Garcia-Molina. Using semantic knowledge for transaction processing in a distributed database. ACM Transactions on Database Systems, 8(2):186--213, June 1983. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. H. Garcia-Molina and K. Salem. SAGAS. In International Conference on Management of Data, May 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. S. Ghemawat, H. Gobioff, and S.-T. Leung. The Google file system. In ACM Symposium on Operating Systems Principles, Oct. 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. D. K. Gifford. Information storage in a decentralized computer system. Technical Report CSL-81-8, Xerox Parc, Mar. 1982. Extended version of the Ph.D. thesis of D. K. Gifford. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. J. Gray and A. Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, USA, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. H. He, J. Xie, J. Yang, and H. Yu. Asymmetric batch incremental view maintenance. In International Conference on Data Engineering, Apr. 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. E. P. C. Jones, D. J. Abadi, and S. Madden. Low overhead concurrency control for partitioned main memory databases. In International Conference on Management of Data, June 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. N. Kallen. Big data in real time at Twitter. QCon: The annual international software development conference, Nov. 2010. http://www.slideshare.net/nkallen/q-con-3770885.Google ScholarGoogle Scholar
  37. T. Kraska, G. Pang, M. Franklin, S. Madden, and A. Fekete. MDCC: Multi-data center consistency. In European Conference on Computer Systems, Apr. 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. C. Li, D. Porto, A. Clement, R. Rodrigues, N. Preguia, and J. Gehrke. Making geo-replicated systems fast if possible, consistent when necessary. In Symposium on Operating Systems Design and Implementation, Oct. 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. W. Lloyd, M. Freedman, M. Kaminsky, and D. Andersen. Don't settle for eventual: Stronger consistency for wide-area storage with COPS. In ACM Symposium on Operating Systems Principles, Oct. 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. W. Lloyd, M. Freedman, M. Kaminsky, and D. Andersen. Stronger semantics for low-latency geo-replicated storage. In Symposium on Networked Systems Design and Implementation, Apr. 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. H. Mahmoud, F. Nawab, A. Pucher, D. Agrawal, and A. E. Abbadi. Low-latency multi-datacenter databases using replicated commit. Proceedings of the VLDB Endowment, 6(9):661--672, July 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. C. Mohan, B. Lindsay, and R. Obermarck. Transaction management in the R* distributed database management system. ACM Transactions on Database Systems, 11(4):378--396, Dec. 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. D. Peng and F. Dabek. Incremental processing of large data sets. In Symposium on Operating Systems Design and Implementation, Oct. 2010.Google ScholarGoogle Scholar
  44. D. Pritchett. BASE: An acid alternative. ACM Queue, 6(3):48--55, May 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. D. Quass and J. Widom. On-line warehouse view maintenance. In International Conference on Management of Data, May 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. E. Schurman and J. Brutlag. The user and business impact of server delays, additional bytes, and HTTP chunking in web search. In Velocity Web Performance and Operations Conference, June 2009.Google ScholarGoogle Scholar
  47. M. Shapiro, N. Preguiça, C. Baquero, and M. Zawirski. Conflict-free replicated data types. In International Symposium on Stabilization, Safety, and Security of Distributed Systems, Oct. 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. D. Shasha, F. Llirbat, E. Simon, and P. Valduriez. Transaction chopping: Algorithms and performance studies. ACM Transactions on Database Systems, 20(3):325--363, Sept. 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. A. Silberstein, J. Terrace, B. Cooper, and R. Ramakrishnan. Feeding frenzy: Selectively materializing users' event feeds. In International Conference on Management of Data, June 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Y. Sovran, R. Power, M. K. Aguilera, and J. Li. Transactional storage for geo-replicated systems. In ACM Symposium on Operating Systems Principles, Oct. 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. J. Stribling, Y. Sovran, I. Zhang, X. Pretzer, J. Li, F. Kaashoek, and R. Morris. Simplifying wide-area application development with WheelFS. In Symposium on Networked Systems Design and Implementation, Apr. 2009.Google ScholarGoogle Scholar
  52. D. Terry, A. Demers, K. Petersen, M. Spreitzer, M. Theimer, and B. Welch. Session guarantees for weakly consistent replicated data. In International Conference on Parallel and Distributed Information Systems, Sept. 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. A. Thomson and D. J. Abadi. The case for determinism in database systems. Proceedings of the VLDB Endowment, 3(1):70--80, Sept. 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. G. Weikum and G. Vossen. Transactional information systems: theory, algorithms, and the practice of concurrency control and recovery. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. J. Zhou, P. Larson, and H. Elmongui. Lazy maintenance of materialized views. In International Conference on Very Large Data Bases, Sept. 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in
  • Published in

    cover image ACM Conferences
    SOSP '13: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
    November 2013
    498 pages
    ISBN:9781450323888
    DOI:10.1145/2517349

    Copyright © 2013 Owner/Author

    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 3 November 2013

    Check for updates

    Qualifiers

    • research-article

    Acceptance Rates

    Overall Acceptance Rate131of716submissions,18%

    Upcoming Conference

    SOSP '24

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader