skip to main content
article
Free Access

A new approach to developing and implementing eager database replication protocols

Authors Info & Claims
Published:01 September 2000Publication History
Skip Abstract Section

Abstract

Database replication is traditionally seen as a way to increase the availability and performance of distributed databases. Although a large number of protocols providing data consistency and fault-tolerance have been proposed, few of these ideas have ever been used in commercial products due to their complexity and performance implications. Instead, current products allow inconsistencies and often resort to centralized approaches which eliminates some of the advantages of replication. As an alternative, we propose a suite of replication protocols that addresses the main problems related to database replication. On the one hand, our protocols maintain data consistency and the same transactional semantics found in centralized systems. On the other hand, they provide flexibility and reasonable performance. To do so, our protocols take advantage of the rich semantics of group communication primitives and the relaxed isolation guarantees provided by most databases. This allows us to eliminate the possibility of deadlocks, reduce the message overhead and increase performance. A detailed simulation study shows the feasibility of the approach and the flexibility with which different types of bottlenecks can be circumvented.

References

  1. ADYA, A. 1999. Weak consistency: A generalized theory and optimistic implementations of distributed transactions. Ph.D. Dissertation. MIT Press, Cambridge, MA. Google ScholarGoogle Scholar
  2. ADYA, A., LISKOV, B., AND O'NEIL, P. 2000. Generalized isolation level definitions. In Proceedings of the 16th International Conference on Data Engineering (ICDE, San Diego, CA, Feb/Mar). 67-78. Google ScholarGoogle Scholar
  3. AGRAWAL, D., ALONSO, G., EL ABBADI, A., AND STANOI, I. 1997. Exploiting atomic broadcast in replicated databases. In Proceedings of the European Conference on Parallel Processing (Euro-Par, Passau, Germany, Aug.). 496-503. Google ScholarGoogle Scholar
  4. AGRAWAL, D., EL ABBADI, A., AND STEINKE, R. C. 1997. Epidemic algorithms in replicated databases (extended abstract). In Proceedings of the 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS '97, Tucson, AZ, May 12-14), A. Mendelzon and Z. M. ~zsoyoglu, Chairs. ACM Press, New York, NY, 161-172. Google ScholarGoogle Scholar
  5. AGRAWAL, D. AND EL ABBADI, A. 1990. The tree quorum protocol: An efficient approach for managing replicated data. In Proceedings of the 16th International Conference on Very Large Data Bases (VLDB, Brisbane, Australia, Aug. 13-16), D. McLeod, R. Sacks-Davis, and H. Schek, Eds. Morgan Kaufmann Publishers Inc., San Francisco, CA, 243-254. Google ScholarGoogle Scholar
  6. AGRAWAL, R., CAREY,M.J.,AND LIVNY, M. 1987. Concurrency control performance modeling: Alternatives and implications. ACM Trans. Database Syst. 12, 4 (Dec.), 609-654. Google ScholarGoogle Scholar
  7. ALONSO, G. 1997. Partial database replication and group communication primitives. In Proceedings of European Research Seminar on Advances in Distributed Systems (Zinal, Switzerland, Mar.).Google ScholarGoogle Scholar
  8. ALONSO, G., REINWALD, B., AND MOHAN, C. 1997. Distributed data management in workflow environment. In Proceedings of the 13th International Conference Workshop on Research Issues in Data Engineering (RIDE, Birmingham, UK, Apr.). IEEE Computer Society, Washington, DC. Google ScholarGoogle Scholar
  9. ALONSO, G., AGRAWAL, D., ABBADI,A.E.,KAMATH, M., KAMATH, M., GUNTHOR, R., AND MOHAN, C. 1996. Advanced transaction models in the workflow contexts. In Proceedings of the 12th IEEE International Conference on Data Engineering (ICDE '97, New Orleans, LA, Feb.). IEEE Press, Piscataway, NJ, 574-581. Google ScholarGoogle Scholar
  10. ALSBERG, P. AND DAY, J. 1976. A principle for resilient sharing of distributed resources. In Proceedings of the International Conference on Software Engineering (San Francisco, CA, Oct.). 562-570. Google ScholarGoogle Scholar
  11. ANDERSON, T. A., BREITBART, Y., KORTH, H. F., AND WOOL, A. 1998. Replication, consistency, and practicality: Are these mutually exclusive? In Proceedings of ACM SIGMOD International Conference on Management of Data (SIGMOD '98, Seattle, WA, June 1-4), A. Tiwary and M. Franklin, Eds. ACM Press, New York, NY, 484-495. Google ScholarGoogle Scholar
  12. ANSI. 1992. X3.135-1992. American National Standard for Information Systems-Database Languages-SQL.Google ScholarGoogle Scholar
  13. BEERI, C., BERNSTEIN, P. A., AND GOODMAN, N. 1989. A model for concurrency in nested transactions systems. J. ACM 36, 2 (Apr.), 230-269. Google ScholarGoogle Scholar
  14. BERENSON, H., BERNSTEIN, P., GRAY, J., MELTON, J., O'NEIL, E., AND O'NEIL, P. 1995. A critique of ANSI SQL isolation levels. In Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data (SIGMOD '95, San Jose, CA, May 23-25), M. Carey and D. Schneider, Eds. ACM Press, New York, NY, 1-10. Google ScholarGoogle Scholar
  15. BERNSTEIN, P. A., HADZILACOS, V., AND GOODMAN, N. 1987. Concurrency Control and Recovery in Database Systems. Addison-Wesley Longman Publ. Co., Inc., Reading, MA. Google ScholarGoogle Scholar
  16. BIRMAN, K., SCHIPER, A., AND STEPHENSON, P. 1991. Lightweight causal and atomic group multicast. ACM Trans. Comput. Syst. 9, 3 (Aug.), 272-314. Google ScholarGoogle Scholar
  17. BREITBART, Y., BREITBART, Y., KOMONDOOR, R., RASTOGI, R., AND SILBERSCHATZ, A. 1999. Update propagation protocols for replicated databases. In Proceedings of the 1999 ACM Conference on Management of Data (SIGMOD '99, Philadelphia, PA, June). ACM Press, New York, NY, 97-108. Google ScholarGoogle Scholar
  18. BREITBART, Y. AND KORTH, H. F. 1997. Replication and consistency: being lazy helps sometimes. In Proceedings of the 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS '97, Tucson, AZ, May 12-14), A. Mendelzon and Z. M. ~zsoyoglu, Chairs. ACM Press, New York, NY, 173-184. Google ScholarGoogle Scholar
  19. BRIDGE, W., JOSHI, A., KEIHL, M., LAHIRI, T., LOAIZA, J., AND MACNAUGHTON, N. 1997. The Oracle universal server buffer manager. In Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB '97, Athens, Greece, Aug.). 590-594. Google ScholarGoogle Scholar
  20. CERI, S., HOUTSMA, M. A., AND KELLER, A. 1991. A classification of update methods for replicated databases. Tech. Rep. CS-TR-91-1392. Computer Systems Laboratory, Stanford University, Stanford, CA. Google ScholarGoogle Scholar
  21. CHANDRA, T. D. AND TOUEG, S. 1991. Unreliable failure detectors for asynchronous systems (preliminary version). In Proceedings of the 10th Annual ACM Symposium on Principles of Distributed Computing (PODC '91, Montreal, Que., Canada, Aug. 19-21), L. Logrippo, Chair. ACM Press, New York, NY, 325-340. Google ScholarGoogle Scholar
  22. CHEN, S. -W. AND PU, C. 1992. A structural classification of integrated replica control mechanisms. Tech. Rep. CUCS-006-92. Columbia University, New York, NY.Google ScholarGoogle Scholar
  23. CHEUNG, S. Y., AHAMAD, M., AND AMMAR, M. H. 1990. The grid protocol: A high performance scheme for maintaining replicated data. In Proceedings of the International Conference on Data Engineering (ICDE, Los Angeles, CA, Feb.). 438-445. Google ScholarGoogle Scholar
  24. CHUNDI, P., ROSENKRANTZ, D. J., AND RAVI, S. S. 1996. Deferred updates and data placement in distributed databases. In Proceedings of the 12th IEEE International Conference on Data Engineering (ICDE '97, New Orleans, LA, Feb.). IEEE Press, Piscataway, NJ, 469-476. Google ScholarGoogle Scholar
  25. DOLEV, D. AND MALKI, D. 1996. The Transis approach to high availability cluster communication. Commun. ACM 39, 4, 63-70. Google ScholarGoogle Scholar
  26. EL ABBADI, A. AND TOUEG, S. 1989. Maintaining availability in partitioned replicated databases. ACM Trans. Database Syst. 14, 2 (June), 264-290. Google ScholarGoogle Scholar
  27. ESWARAN, K. P., GRAY, J. N., LORIE, R. A., AND TRAIGER, I. 1976. The notions of consistency and predicate locks in a database system. Commun. ACM 19, 11, 624-633. Google ScholarGoogle Scholar
  28. FRIEDMAN, R. AND VAN RENESSE, R. 1995a. Packing messages as a tool for boosting the performance of total ordering protocols. Tech. Rep. TR-95-1527. Department of Computer Science, Cornell University, Ithaca, NY. Google ScholarGoogle Scholar
  29. FRIEDMAN, R. AND VAN RENESSE, R. 1995b. Strong and weak virtual synchrony in Horus. Tech Rep. TR-95-1537. Department of Computer Science, Cornell University, Ithaca, NY.Google ScholarGoogle Scholar
  30. GIFFORD, D. K. 1979. Weighted voting for replicated data. In Proceedings of the 7th Symposium on Operating Systems Principles (Asilomar, CA, Dec.). 150-162. Google ScholarGoogle Scholar
  31. GOLDRING, R. 1994. A discussion of relational database replication technology. InfoDB 8,1.Google ScholarGoogle Scholar
  32. GRAY, J., HELLAND, P., O'NEIL, P., AND SHASHA, D. 1996. The dangers of replication and a solution. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '96, Montreal, Canada). ACM, New York, NY. Google ScholarGoogle Scholar
  33. GRAY, J., LORIE, R., PUTZOLU, G., AND TRAIGER, I. 1976. Granularity of locks and degrees of consistency in a shared database. In Modeling in Data Base Mangement Systems. Elsevier North-Holland, Inc., Amsterdam, The Netherlands.Google ScholarGoogle Scholar
  34. GRAY, J. AND REUTER, A. 1993. Transaction Processing: Concepts and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA. Google ScholarGoogle Scholar
  35. GUERRAOUI, R. AND SCHIPER, A. 1995. Transaction model vs virtual synchrony model: Bridging the gap. In Theory and Practice in Distributed Systems. Springer-Verlag, New York, NY, 121-132. Google ScholarGoogle Scholar
  36. GUPTA, R., HARITSA, J., AND RAMAMRITHAM, K. 1997. Revisiting commit processing in distributed database systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '97, Tucson, AZ, May). ACM Press, New York, NY, 486-497. Google ScholarGoogle Scholar
  37. HADZILACOS, V. AND TOUEG, S. 1993. Fault-tolerant broadcasts and related problems. In Distributed Systems (2nd ed.), S. Mullender, Ed. Addison-Wesley Longman Publ. Co., Inc., Reading, MA, 97-145. Google ScholarGoogle Scholar
  38. HOLLIDAY, J., AGRAWAL, D., AND ABBADI, A. E. 1999. The performance of database replication with group multicast. In Proceedings of the International Symposium on Fault-Tolerant Computing (FTCS, Madison, Wisconsin, June). 158-165. Google ScholarGoogle Scholar
  39. KEMME, B. AND ALONSO, G. 1998. A suite of database replication protocols based on group communication primitives. In Proceedings of the 18th IEEE International Conference on Distributed Computing Systems (ICDCS '98, Amsterdam, The Netherlands, May). IEEE Computer Society Press, Los Alamitos, CA. Google ScholarGoogle Scholar
  40. KEMME, B., PEDONE, F., ALONSO, G., AND SCHIPER, A. 1999. Processing transactions over optimistic atomic broadcast protocols. In Proceedings of the International Conference on Distributed Computing Systems (ICDCS '99, Austin, TX, June). Google ScholarGoogle Scholar
  41. KRISHNAKUMAR, N. AND BERNSTEIN, A. J. 1991. Bounded ignorance in replicated systems. In Proceedings of the Tenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of database systems (PODS '91, Denver, CO, May 29-31), D. J. Rosenkrantz, Chair. ACM Press, New York, NY, 63-74. Google ScholarGoogle Scholar
  42. LAMPORT, L. 1978. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7, 558-565. Google ScholarGoogle Scholar
  43. LIVNY, M. 1989. Parallelism and concurrency control performance in distributed database machines. In Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data (SIGMOD '89, Portland, OR, June), J. Clifford, B. Lindsay, and D. Maier, Eds. ACM Press, New York, NY, 122-133. Google ScholarGoogle Scholar
  44. MAEKAWA, M. 1985. A IN algorithm for mutual exclusion in decentralized systems. ACM Trans. Comput. Syst. 3, 2 (May), 145-159. Google ScholarGoogle Scholar
  45. MOSER, L. E., AMIR, Y., MELLIAR-SMITH, P. M., AND AGARWAL, D. A. 1994. Extended virtual synchrony. In Proceedings of the 14th IEEE International Conference on Distributed Computing Systems (ICDCS '94, Poznan, Poland, June). IEEE Computer Society Press, Los Alamitos, CA, 56-65.Google ScholarGoogle Scholar
  46. MOSER, L. E., MELLIAR-SMITH, P. M., AGARWAL, D. A., BUDHIA, R. K., AND LINGLEY-PAPADOPOU- LOS, C. A. 1996. Totem: a fault-tolerant multicast group communication system. Commun. ACM 39,4,54-63. Google ScholarGoogle Scholar
  47. NEIGER, G. AND TOUEG, S. 1988. Automatically increasing the fault-tolerance of distributed systems. In Proceedings of the Seventh Annual ACM Symposium on Principles of Distributed Computing (PODC '88, Toronto, Ont., Canada, Aug. 15-17), D. Dolev, Chair. ACM Press, New York, NY, 248-262. Google ScholarGoogle Scholar
  48. ORACLE. 1995. Concurrency control, transaction isolation and serializability in SQL92 and Oracle7. White paper.Google ScholarGoogle Scholar
  49. ORACLE. 1997. Oracle8(TM) Server Replication: Concepts Manual.Google ScholarGoogle Scholar
  50. PACITTI, E., MINET, P., AND SIMON, E. 1999. Fast algorithms for maintaining replica consistency in lazy master replicated databases. In Proceedings of the International Conference on Very Large Data Bases (VLDB, Edinburgh, Scotland, Sept.). 126-137. Google ScholarGoogle Scholar
  51. PEDONE, F., GUERRAOUI, R., AND SCHIPER, A. 1997. Transaction reordering in replicated databases. In Proceedings of the Symposium on Reliable Distributed Systems (SRDS, Durham, NC, Oct.). Google ScholarGoogle Scholar
  52. PU, C. AND LEFF, A. 1991. Replica control in distributed systems: An asynchronous approach. In Proceedings of the 1991 ACM SIGMOD International Conference on Management of Data (SIGMOD '91, Denver, CO, May 29-31), J. Clifford and R. King, Eds. ACM Press, New York, NY, 377-386. Google ScholarGoogle Scholar
  53. RABINOVICH, M., GEHANI, N. H., AND KONONOV, A. 1996. Scalable update propagation in epidemic replicated databases. In Proceedings of the International Conference on Extending Data Base Technology (EDBT, Avignon, France). 207-222. Google ScholarGoogle Scholar
  54. SCHIPER, A. AND SANDOZ, A. 1993. Uniform reliable multicast in a virtually synchronous environment. In Proceedings of the 13th International Conference on Distributed Computing Systems (ICDCS '93, Pittsburgh, PA). 561-568.Google ScholarGoogle Scholar
  55. SHASHA, D. 1997. Lessons from Wall Street. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '97, Tucson, AZ, May). ACM Press, New York, NY, 498-501. Google ScholarGoogle Scholar
  56. SIDELL, J., AOKI, P., SAH, A., STAELIN, C., STONEBRAKER, M., AND YU, A. 1996. Data replication in Mariposa. In Proceedings of the 12th IEEE International Conference on Data Engineering (ICDE '97, New Orleans, LA, Feb.). IEEE Press, Piscataway, NJ, 485-494. Google ScholarGoogle Scholar
  57. STACEY, D. 1994. Replication: DB2, Oracle, or Sybase. Database Program. Des. 7, 12.Google ScholarGoogle Scholar
  58. STANOI, I., AGRAWAL, D., AND EL ABBADI, A. 1998. Using broadcast primitives in replicated databases. In Proceedings of the 18th IEEE International Conference on Distributed Computing Systems (ICDCS '98, Amsterdam, The Netherlands, May). IEEE Computer Society Press, Los Alamitos, CA. Google ScholarGoogle Scholar
  59. STONEBRAKER, M. 1979. Concurrency control and consistency of multiple copies of data in distributed INGRES. IEEE Trans. Softw. Eng. SE-5, 3 (May), 188-194.Google ScholarGoogle Scholar
  60. VAN RENESSE, R., BIRMAN, K. P., AND MAFFEIS, S. 1996. Horus: a flexible group communication system. Commun. ACM 39, 4, 76-83. Google ScholarGoogle Scholar
  61. WHITNEY, A., SHASHA, D., AND APTER, S. 1997. High volume transaction processing without concurrency control, two phase commit, SQL or C++. In Proceedings of the International Workshop on High Performance Transaction Systems (Asilomar, CA, Sept.).Google ScholarGoogle Scholar

Index Terms

  1. A new approach to developing and implementing eager database replication protocols

                    Recommendations

                    Reviews

                    Edward Y. Lee

                    Many researchers and authors have visited the subject of database replication protocols since the early 1970's, mostly in conjunction with the discussions of distributed databases. This topic is difficult because there are conflicting requirements among the performance, reliability, data consistency and fault-tolerance issues for these protocols. Many complex schemes were proposed but few were implemented by the commercially available database management systems (DBMS). Following the methodology first introduced by Gray et al; the authors classified the replication mechanism into four classes as the four quadrants of the matrix (see Table I in the paper). They are either classified as eager or lazy on the time frame of the replication update (the x-axis). As to where the replications are being applied, they are divided into either primary copy or update everywhere (the y-axis). The primary copy is the centralized approach while the update everywhere is the distributed approach. Kemme and Alonso maintain that their protocols follow a solution based on a combination of ideas from group communication and concurrency control. They rely on the group communication systems that provide the group maintenance, reliable message exchange, and message-ordering primitives between a group of nodes in a distributed system. Using the read-one/write-all-available (ROWAA) algorithm, they are able to reduce the number of messages per transaction. The group communication also ensures that all messages are received in the same total order at all sites. Assuming all operations of a transaction are sent in a single message, the transactions will arrive at all sites in the same order. By granting the locks in the order of the transaction arrival one can guarantee that all sites perform the same updates and in exactly the same order to avoid deadlock. The next improvement the authors attempted is to lessen the serialization restrictions that are implemented in the commercial DBMS. They use alternative correctness criteria that allow lower levels of isolation. The different levels of isolation allow a trade-off of performance vs. correctness and allow more optimal implementation depending on the situations. The last improvement of their protocols is to optimize the use of different levels of fault-tolerance. This approach is similar to the different levels of isolation. Full correctness can be weakened to provide faster solution as a trade-off. The reliability of message delivery will determine the overall correctness. It is true that while complex message exchange mechanisms guarantee the atomicity of transactions beyond failures, faster communication protocols only provide a best effort approach. Here is the summary of the 9 sections of the paper: Section 1 gives the introduction; section 2 provides an overview of the existing replication solutions. Section 3 presents the basic conception of their protocols. Section 4 describes the system model. Section 5 presents a family of protocols providing different levels of isolation and Section 6 redefines the algorithms to provide different levels of fault-tolerance. Section 7 provides an overview of the simulation system, which provides analyses of the protocols. Section 8 describes the conducted experiments and their results. Section 9 gives the conclusions of the paper. Appendix A and B provide the more formal proofs of the protocols in sections 5 and 6. There are over 60 references at the end of the paper. This is a fairly comprehensive paper on the eager database replication protocols, which are important in implementing distributed DBMS in a network of nodes. This paper should be read by the DBMS pratitioners who are interested in more practical solutions to the complex world of database replication protocols in a real world environment where compromises are mandatory.

                    Access critical reviews of Computing literature here

                    Become a reviewer for Computing Reviews.

                    Comments

                    Login options

                    Check if you have access through your login credentials or your institution to get full access on this article.

                    Sign in

                    Full Access

                    PDF Format

                    View or Download as a PDF file.

                    PDF

                    eReader

                    View online with eReader.

                    eReader