Abstract
A crashing network protocol is an asynchronous protocol whose memory does not survive crashes. We show that a crashing network protocol that works over unreliable links can be driven to arbitrary global states, where each node is in a state reached in some (possibly different) execution, and each link has an arbitrary mixture of packets sent in (possibly different) executions. Our theorem considerably generalizes an earlier result, due to Fekete et al., which states that there is no correct crashing Data Link Protocol. For example, we prove that there is no correct crashing protocol for token passing and for many other resource allocation protocols such as k-exclusion, and the drinking and dining philosophers problems. We further characterize the reachable states caused by crash failures using reliable non-FIFO and reliable FIFO links. We show that with reliable non-FIFO links any acyclic subset of nodes and links can be driven to arbitrary states. We show that with reliable FIFO links, only nodes can be driven to arbitrary states. Overall, we show a strict hierarchy in terms of the set of states reachable by crash failures in the three link models.
- AFEK, Y., AWERBUCH, B., AND GAFNI, E. 1987. Applying static network protocols to dynamic networks. In Proceedings of the 28th IEEE Symposium on Foundations of Computer Science (Oct.). IEEE Computer Society Press, Los Alamitos, Calif. pp. 358-370.Google Scholar
- AFEK, Y., AND BROWN, G.M. 1993. Self-stabilization over unreliable communication media. Distr. Comput. 7, 1, 27-34. Google Scholar
- ATTIYA, H., DOLEV, S., AND WELCH, J. L. 1995. Connection management without retaining information. Inf. Comput. 123, 2, (Dec.), 155-171. Google Scholar
- BARATZ, A., AND SEGALL, A. 1988. Reliable link initialization procedures. IEEE Trans. Commun. (Feb.), 144-153.Google Scholar
- DIGITAL EQUIPMENT CORPORATION. 1983. Phase IV NSP Functional Specification. Digital Order Number AA-X439A-TK.Google Scholar
- FEKETE, A., LYNCH, N. A., MANSOUR, Y., AND SPINELLI, J. 1993. The impossibility of implementing reliable communication in the face of crashes. J. ACM, 40, 5 (Nov.). Google Scholar
- FINN, S. C. 1979. Resynch procedures and a fail-safe network protocol. IEEE Trans. Commun. COM-27, 6 (June), 840-845.Google Scholar
- JAYARAM, M. 1996. Fault span of crash failures. M.S. Thesis, Washington Univ. St. Louis, MO.Google Scholar
- JAYARAM, M., AND VARGHESE, G. 1997. The complexity of crash failures. In Proceedings of the 16th Annual ACM Symposium on Principles of Distributed Computing (Santa Barbara, Calif., Aug. 21-24). ACM, New York, 179-188. Google Scholar
- LYNCH, N. A., AND TUTTLE, M.R. 1989. An introduction to input/output automata. CWI Quarterly 2, 3, 219-246.Google Scholar
- LYNCH, N.A. 1996. Distributed Algorithms. Morgan-Kaufman, San Francisco, Calif. Google Scholar
- McQUILLAN, J. M., RICHER, I., AND ROSEN, E. C. 1980. The new routing algorithm for the arpanet. IEEE Trans. Commun. COM-28, 5 (May), 711-719.Google Scholar
- TANNENBAUM, A. 1996. Computer Networks, 3rd ed. Prentice-Hall, Upper Saddle River, N.J. Google Scholar
- WATSON, R.W. 1981. Timer based mechanisms in reliable transport protocol connection management. Comput. Netw. 5 (Feb.), 47-56.Google Scholar
Index Terms
- The fault span of crash failures
Recommendations
Contention-related crash failures: Definitions, agreement algorithms, and impossibility results
AbstractThis article explores an interplay between process crash failures and concurrency. Namely, it aims at answering the question, “Is it possible to cope with more crash failures when some number of crashes occur before some predefined ...
Consensus in anonymous asynchronous systems with crash-recovery and omission failures
AbstractIn anonymous distributed systems, processes are indistinguishable because they have no identity and execute the same algorithm. Currently, anonymous systems are receiving a lot of attention mainly because they preserve privacy, which is an ...
Comments