Abstract
The evolution of distributed computing and applications has put new challenges on models, architectures and systems. To name just one, 'reconciling uncertainty with predictability' is required by today's simultaneous pressure on increasing the quality of service of applications, and on degrading the assurance given by the infrastructure.This challenge can be mapped onto more than one facet, such as time or security or others. In this paper we explore the time facet, reviewing past and present of distributed systems models, and making the case for the use of hybrid (vs. homogeneous) models, as a key to overcoming some of the difficulties faced when asynchronous models (uncertainty) meet timing specifications (predictability). The Wormholes paradigm is described as the first experiment with hybrid distributed systems models.
- M. Aguilera, G. Le Lann, and S. Toueg. On the impact of fast failure detectors on real-time fault-tolerant systems. In Proc. of DISC 2002, October 2002.]] Google ScholarDigital Library
- E. Anceaume, B. Charron-Bost, P. Minet, and S. Toueg. On the formal specification of group membership services. Technical Report RR-2695, INRIA, Rocquencourt, France, November 1995.]]Google Scholar
- M. Ben-Or. Another advantage of free choice: Completely asynchronous agreement protocols. In Proceedings of the 2nd ACM Symposium on Principles of Distributed Computing, pages 27--30, August 1983.]] Google ScholarDigital Library
- A. Casimiro, P. Martins, and P. Veríssimo. How to build a Timely Computing Base using Real-Time Linux. In Proceedings of the IEEE International Workshop on Factory Communication Systems, pages 127--134, September 2000.]]Google ScholarCross Ref
- M. Castro and B. Liskov. Practical Byzantine fault tolerance and proactive recovery. ACM Transactions on Computer Systems, 20(4):398--461, November 2002.]] Google ScholarDigital Library
- T. Chandra, V. Hadzilacos, S. Toueg, and B. Charron-Bost. On the impossibility of group membership. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing, pages 322--330, May 1996.]] Google ScholarDigital Library
- T. Chandra and S. Toueg. Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2):225--267, March 1996.]] Google ScholarDigital Library
- F. Christian and C. Fetzer. The timed asynchronous system model. In Proceedings of the 28th IEEE International Symposium on Fault-Tolerant Computing, pages 140--149, 1998.]] Google ScholarDigital Library
- M. Correia, N. F. Neves, L. C. Lung, and P. Veríssimo. Low complexity Byzantine-resilient consensus. Distributed Computing, 17(3):237--249, 2005.]] Google ScholarDigital Library
- M. Correia, P. Veríssimo, and N. F. Neves. The design of a COTS real-time distributed security kernel. In Proceedings of the Fourth European Dependable Computing Conference, pages 234--252, October 2002.]] Google ScholarDigital Library
- C. Delporte-Gallet, H. Fauconnier, and R. Guerraoui. A realistic look at failure detectors. In Proceedings of the International Conference on Dependable Systems and Networks, pages 213--222, Washington, USA, June 2002.]] Google ScholarDigital Library
- D. Dolev, C. Dwork, and L. Stockmeyer. On the minimal synchronism needed for distributed consensus. Journal of the ACM, 34(1):77--97, January 1987.]] Google ScholarDigital Library
- C. Dwork, N. Lynch, and L. Stockmeyer. Consensus in the presence of partial synchrony. Journal of the ACM, 35(2):288--323, April 1988.]] Google ScholarDigital Library
- Christof Fetzer. Perfect failure detection in timed asynchronous systems. IEEE Trans. Comput., 52(2):99--112, 2003.]] Google ScholarDigital Library
- M. J. Fischer, N. A. Lynch, and M. S. Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM, 32(2):374--382, April 1985.]] Google ScholarDigital Library
- R. Friedman, A. Moustefaoui, S. Rajsbaum, and M. Raynal. Error correcting codes: A future direction to solve distributed agreement problems? In International Workshop on Future Directions of Distributed Computing, FuDiCo, June 2002.]]Google Scholar
- Roy Friedman, Achour Mostéfaoui, and Michel Raynal. Building and using quorums despite any number of process crashes. In 5th European Dependable Computing Conference (EDCC'05), Budapest, Hungary.]]Google Scholar
- J.M. Helary, M. Hurfin, A. Mostefaoui, M. Raynal, and Tronel F. Computing global functions on asynchronous distributed systems with perfect failure detectors. IEEE Transactions on Parallel and Distributed Systems, 11(9), September 2000.]] Google ScholarDigital Library
- I. Keidar and S. Rajsbaum. On the cost of fault-tolerant consensus when there are no faults - a tutorial. SIGACTN: SIGACT News (ACM Special Interest Group on Automata and Computability Theory), 32(2):45--63, 2001. Preliminary version, MIT Technical Report MIT-LCS-TR-821, May 24, 2001.]] Google ScholarDigital Library
- F. Meyer and D. Pradhan. Consensus with dual failure modes. In Proceedings of the 17th IEEE International Symposium on Fault-Tolerant Computing, pages 214--222, July 1987.]]Google Scholar
- A. Mostéfaoui, E. Mourgaya, and M. Raynal. Asynchronous implementation of failure detectors. In Int. IEEE/IFIP Conference on Dependable Systems and Networks (DSN'03), San Francisco (USA).]]Google Scholar
- N. F. Neves, M. Correia, and P. Veríssimo. Solving vector consensus with a wormhole. IEEE Transactions on Parallel and Distributed Systems, 16(12):1120--1131, December 2005.]] Google ScholarDigital Library
- D. Powell. Fault assumptions and assumption coverage. In Proceedings of the 22nd IEEE International Symposium of Fault-Tolerant Computing, July 1992.]]Google Scholar
- D. Powell, D. Seaton, G. Bonn, P. Veríssimo, and F. Waeselynk. The Delta-4 approach to dependability in open distributed computing systems. In Proceedings of the 18th IEEE International Symposium on Fault-Tolerant Computing, June 1988.]]Google ScholarCross Ref
- M. Raynal. Short introduction to failure detectors for asynchronous distributed systems. SIGACTN: SIGACT News (ACM Special Interest Group on Automata and Computability Theory), 36(1):53--70, 2005.]] Google ScholarDigital Library
- Nicola Santoro and Peter Widmayer. Majority and unanimity in synchronous networks with ubiquitous dynamic faults. In SIROCCO, pages 262--276, 2005.]] Google ScholarDigital Library
- P. Sousa, N. F. Neves, and P. Verissimo. How resilient are distributed f fault/intrusion-tolerant systems? In Proceedings of the IEEE International Conference on Dependable Systems and Networks, June 2005.]] Google ScholarDigital Library
- Jan van Leeuwen and Jir Wiedermann. Beyond the turing limit: Evolving interactive systems. In Leszek Pacholski and Peter Ruzicka, editors, SOFSEM: Theory and Practice of Informatics, 28th Conference on Current Trends in Theory and Practice of Informatics, volume 2234 of Lecture Notes in Computer Science, pages 90--109, Piestany, Slovak Republic, 2001. Springer.]] Google ScholarDigital Library
- P. Veríssimo. Uncertainty and predictability: Can they be reconciled? In Future Directions in Distributed Computing, volume 2584 of Lecture Notes in Computer Science, pages 108--113. Springer-Verlag, 2003.]]Google Scholar
- P. Veríssimo and A. Casimiro. The Timely Computing Base model and architecture. IEEE Transactions on Computers, 51(8):916--930, August 2002. Supersedes Tech. Rep. DI/FCUL TR-99-2, Dpt. of Informatics, University of Lisboa, May 1999.]] Google ScholarDigital Library
- P. Veríssimo, A. Casimiro, and C. Fetzer. The Timely Computing Base: Timely actions in the presence of uncertain timeliness. In Proceedings of the International Conference on Dependable Systems and Networks, pages 533--542, June 2000.]] Google ScholarDigital Library
- P. Veríssimo and L. Rodrigues. Distributed Systems for System Architects. Kluwer Academic Publishers, 2001.]] Google ScholarDigital Library
- P. Veríssimo, L. Rodrigues, and A. Casimiro. Cesiumspray: a precise and accurate global clock service for large-scale systems. Journal of Real-Time Systems, 12(3):243--294, May 1997.]] Google ScholarDigital Library
- C. Walter, N. Suri, and M. Hugue. Continual on-line diagnosis of hybrid faults. In Proceedings of the 4th IFIP International Working Conference on Dependable Computing for Critical Applications, 1994.]]Google Scholar
- L. Zhou, F. Schneider, and R. van Renesse. COCA: A secure distributed on-line certification authority. ACM Transactions on Computer Systems, 20(4):329--368, November 2002.]] Google ScholarDigital Library
Index Terms
- Travelling through wormholes: a new look at distributed systems models
Recommendations
The Binested Inequalities for the Symmetric Traveling Salesman Polytope
In this paper we define a family of valid inequalities for the Symmetric Travelling Salesman Polytope which are defined on two nested sets of vertices of the graph. These inequalities generalize the comb inequalities of Chvítal, Grötschel and Padberg, ...
Clique Tree Inequalities and the Symmetric Travelling Salesman Problem
The linear programming cutting plane approach for solving the travelling salesman problem has recently proven to be highly successful, cf. Crowder and Padberg Crowder, H. P., M. W. Padberg. 1980. Solving large-scale symmetric travelling salesman ...
Facets of the Asymmetric Traveling Salesman Polytope
In this paper we consider the Asymmetric Traveling Salesman polytope, Pn, defined as the convex hull of the incidence vectors of tours in a complete digraph with n vertices, and its monotonization, P̃n. Several classes of valid inequalities for both ...
Comments