ABSTRACT
The problem. We consider the problem of implementing a consistent replicated object in a partially synchronous message passing distributed system susceptible to process and communication failures. The object is a generic shared resource, such as a data structure, a file, or a lock. The processes implementing the replicated object access it by applying operations to it at unpredictable times and potentially concurrently.1 The object should be linearizable: it should behave as if each operation applied to it takes effect at a distinct instant in time during the interval between its invocation and its response.
The main reason for replicating an object is fault tolerance; if a copy of the object becomes inaccessible, the object can still be accessed through the other copies. Another reason for replicating an object is performance; a process that requires the object can access its local copy. The benefit of accessing a local copy, however, must be weighed against the cost of keeping the copies consistent.
- Marcos K. Aguilera, Carole Delporte-Gallet, Hugues Fauconnier, and Sam Toueg. On implementing Omega in systems with weak reliability and synchrony assumptions. Dist. Comp., 21(4):285--314, 2008.Google ScholarDigital Library
- Mike Burrows. The Chubby lock service for loosely-coupled distributed systems. In OSDI '06, pages 335--350, 2006. Google ScholarDigital Library
- Tushar D. Chandra, Robert Griesemer, and Joshua Redstone. Paxos made live: An engineering perspective. In PODC '07, pages 398--407, 2007. Google ScholarDigital Library
- Tushar D. Chandra, Vassos Hadzilacos, and Sam Toueg. The weakest failure detector for solving consensus. JACM, 43(4):685--722, 1996. Google ScholarDigital Library
- Tushar D. Chandra and Sam Toueg. Unreliable failure detectors for reliable distributed systems. JACM, 43(2):225--267, 1996. Google ScholarDigital Library
- Cynthia Dwork, Nancy A. Lynch, and Larry Stockmeyer. Consensus in the presence of partial synchrony. JACM, 35(2):288--323, 1988. Google ScholarDigital Library
- James Corbett et al. Spanner: Google's globally-distributed database. In OSDI '12, pages 261--264, 2012. Google ScholarDigital Library
- Jason Baker et al. Megastore: Providing scalable, highly available storage for interactive services. In CIDR '11, pages 223--234, 2011.Google Scholar
- Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. CACM, 21(7):558--565, 1978. Google ScholarDigital Library
- Leslie Lamport. The part-time parliament. ACM TOCS, 16(2):133--169, 1998. Google ScholarDigital Library
- Barbara Liskov and James Cowling. Viewstamped replication revisited. Technical Report MIT-CSAIL-TR-2012-021, 2012.Google Scholar
- Iulian Moraru, David Anderson, and Michael Kaminsky. Paxos quorum leases: Fast reads without sacrificing writes. In SoCC '14, pages 1--13, 2014. Google ScholarDigital Library
- Brian Oki and Barbara Liskov. Viewstamped replication: A new primary copy method to support highly-available distributed systems. In PODC '88, pages 8--17, 1988. Google ScholarDigital Library
- Diego Ongaro and John Ousterhout. In search of an understandable consensus algorithm. In USENIX ATC'14, pages 305--320, 2014. An extended version of this paper is in phhttp://ramcloud.stanford.edu/raft.pdf. Google ScholarDigital Library
- Nicolas Schiper and Sam Toueg. A robust and lightweight stable leader election service for dynamic systems. In DSN '08, pages 207--216, 2008.Google ScholarCross Ref
Index Terms
- An Algorithm for Replicated Objects with Efficient Reads
Recommendations
Consistent and automatic replica regeneration
Reducing management costs and improving the availability of large-scale distributed systems require automatic replica regeneration, that is, creating new replicas in response to replica failures. A major challenge to regeneration is maintaining ...
Building Replicated Internet Services Using TACT: A Toolkit for Tunable Availability and Consistency Tradeoffs
WECWIS '00: Proceedings of the Second International Workshop on Advance Issues of E-Commerce and Web-Based Information Systems (WECWIS 2000)An ultimate goal for modern Internet services is the development of scalable, high-performance, highly available and fault-tolerant systems. Replication is an important approach to achieve this goal. However, replication introduces the issue of ...
Towards generic and middleware-independent support for replicated, distributed objects
MAI '07: Proceedings of the 1st workshop on Middleware-application interaction: in conjunction with Euro-Sys 2007Replication is a commonly used approach to increase the availability of distributed services, which is a non-functional requirement. Thus, replication is in principle independent of the application logic. For this reason, support for replication is part ...
Comments