skip to main content
article
Free Access

The fault span of crash failures

Published:01 March 2000Publication History
Skip Abstract Section

Abstract

A crashing network protocol is an asynchronous protocol whose memory does not survive crashes. We show that a crashing network protocol that works over unreliable links can be driven to arbitrary global states, where each node is in a state reached in some (possibly different) execution, and each link has an arbitrary mixture of packets sent in (possibly different) executions. Our theorem considerably generalizes an earlier result, due to Fekete et al., which states that there is no correct crashing Data Link Protocol. For example, we prove that there is no correct crashing protocol for token passing and for many other resource allocation protocols such as k-exclusion, and the drinking and dining philosophers problems. We further characterize the reachable states caused by crash failures using reliable non-FIFO and reliable FIFO links. We show that with reliable non-FIFO links any acyclic subset of nodes and links can be driven to arbitrary states. We show that with reliable FIFO links, only nodes can be driven to arbitrary states. Overall, we show a strict hierarchy in terms of the set of states reachable by crash failures in the three link models.

References

  1. AFEK, Y., AWERBUCH, B., AND GAFNI, E. 1987. Applying static network protocols to dynamic networks. In Proceedings of the 28th IEEE Symposium on Foundations of Computer Science (Oct.). IEEE Computer Society Press, Los Alamitos, Calif. pp. 358-370.Google ScholarGoogle Scholar
  2. AFEK, Y., AND BROWN, G.M. 1993. Self-stabilization over unreliable communication media. Distr. Comput. 7, 1, 27-34. Google ScholarGoogle Scholar
  3. ATTIYA, H., DOLEV, S., AND WELCH, J. L. 1995. Connection management without retaining information. Inf. Comput. 123, 2, (Dec.), 155-171. Google ScholarGoogle Scholar
  4. BARATZ, A., AND SEGALL, A. 1988. Reliable link initialization procedures. IEEE Trans. Commun. (Feb.), 144-153.Google ScholarGoogle Scholar
  5. DIGITAL EQUIPMENT CORPORATION. 1983. Phase IV NSP Functional Specification. Digital Order Number AA-X439A-TK.Google ScholarGoogle Scholar
  6. FEKETE, A., LYNCH, N. A., MANSOUR, Y., AND SPINELLI, J. 1993. The impossibility of implementing reliable communication in the face of crashes. J. ACM, 40, 5 (Nov.). Google ScholarGoogle Scholar
  7. FINN, S. C. 1979. Resynch procedures and a fail-safe network protocol. IEEE Trans. Commun. COM-27, 6 (June), 840-845.Google ScholarGoogle Scholar
  8. JAYARAM, M. 1996. Fault span of crash failures. M.S. Thesis, Washington Univ. St. Louis, MO.Google ScholarGoogle Scholar
  9. JAYARAM, M., AND VARGHESE, G. 1997. The complexity of crash failures. In Proceedings of the 16th Annual ACM Symposium on Principles of Distributed Computing (Santa Barbara, Calif., Aug. 21-24). ACM, New York, 179-188. Google ScholarGoogle Scholar
  10. LYNCH, N. A., AND TUTTLE, M.R. 1989. An introduction to input/output automata. CWI Quarterly 2, 3, 219-246.Google ScholarGoogle Scholar
  11. LYNCH, N.A. 1996. Distributed Algorithms. Morgan-Kaufman, San Francisco, Calif. Google ScholarGoogle Scholar
  12. McQUILLAN, J. M., RICHER, I., AND ROSEN, E. C. 1980. The new routing algorithm for the arpanet. IEEE Trans. Commun. COM-28, 5 (May), 711-719.Google ScholarGoogle Scholar
  13. TANNENBAUM, A. 1996. Computer Networks, 3rd ed. Prentice-Hall, Upper Saddle River, N.J. Google ScholarGoogle Scholar
  14. WATSON, R.W. 1981. Timer based mechanisms in reliable transport protocol connection management. Comput. Netw. 5 (Feb.), 47-56.Google ScholarGoogle Scholar

Index Terms

  1. The fault span of crash failures

                  Recommendations

                  Comments

                  Login options

                  Check if you have access through your login credentials or your institution to get full access on this article.

                  Sign in

                  Full Access

                  • Published in

                    cover image Journal of the ACM
                    Journal of the ACM  Volume 47, Issue 2
                    March 2000
                    192 pages
                    ISSN:0004-5411
                    EISSN:1557-735X
                    DOI:10.1145/333979
                    Issue’s Table of Contents

                    Copyright © 2000 ACM

                    Publisher

                    Association for Computing Machinery

                    New York, NY, United States

                    Publication History

                    • Published: 1 March 2000
                    Published in jacm Volume 47, Issue 2

                    Permissions

                    Request permissions about this article.

                    Request Permissions

                    Check for updates

                    Qualifiers

                    • article

                  PDF Format

                  View or Download as a PDF file.

                  PDF

                  eReader

                  View online with eReader.

                  eReader