Abstract
Empirical evidence suggests that reactive routing systems improve resilience to Internet path failures. They detect and route around faulty paths based on measurements of path performance. This paper seeks to understand why and under what circumstances these techniques are effective.

To do so, this paper correlates end-to-end active probing experiments, loss-triggered traceroutes of Internet paths, and BGP routing messages. These correlations shed light on three questions about Internet path failures: (1) Where do failures appear? (2) How long do they last? (3) How do they correlate with BGP routing instability?

Data collected over 13 months from an Internet testbed of 31 topologically diverse hosts suggests that most path failures last less than fifteen minutes. Failures that appear in the network core correlate better with BGP instability than failures that appear close to end hosts. On average, most failures precede BGP messages by about four minutes, but there is often increased BGP traffic both before and after failures. Our findings suggest that reactive routing is most effective between hosts that have multiple connections to the Internet. The data set also suggests that passive observations of BGP routing messages could be used to predict about 20% of impending failures, allowing re-routing systems to react more quickly to failures.
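One way to picture the kind of correlation the abstract describes is to count, for each observed path failure, how many BGP messages fall in a time window just before and just after it. The sketch below is illustrative only and is not the paper's method: the function names, the flat lists of timestamps in seconds, and the 240-second window (chosen loosely to echo the "about four minutes" figure) are all assumptions.

```python
from bisect import bisect_left, bisect_right


def bgp_counts_near_failures(failure_times, bgp_times, window_s=240):
    """For each failure time, count BGP messages shortly before and after it.

    Hypothetical helper: timestamps are plain numbers in seconds, and
    window_s is an assumed window size, not a value taken from the paper.
    """
    bgp_sorted = sorted(bgp_times)
    results = []
    for t in failure_times:
        # Messages in [t - window_s, t): BGP activity preceding the failure.
        before = bisect_left(bgp_sorted, t) - bisect_left(bgp_sorted, t - window_s)
        # Messages in [t, t + window_s]: BGP activity following the failure.
        after = bisect_right(bgp_sorted, t + window_s) - bisect_left(bgp_sorted, t)
        results.append((t, before, after))
    return results


def fraction_preceded_by_bgp(failure_times, bgp_times, window_s=240):
    """Fraction of failures with at least one BGP message in the preceding window."""
    counts = bgp_counts_near_failures(failure_times, bgp_times, window_s)
    if not counts:
        return 0.0
    return sum(1 for _, before, _ in counts if before > 0) / len(counts)


if __name__ == "__main__":
    failures = [1000, 5000, 9000]          # made-up failure timestamps
    bgp_msgs = [900, 1100, 5200, 8000]     # made-up BGP message timestamps
    for t, before, after in bgp_counts_near_failures(failures, bgp_msgs):
        print(f"failure at {t}: {before} BGP msgs before, {after} after")
    print(fraction_preceded_by_bgp(failures, bgp_msgs))
```

A failure with BGP messages in the preceding window is the case a passive predictor could in principle exploit; a real analysis would also need to restrict BGP messages to prefixes relevant to the failed path, which this sketch does not attempt.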
Measuring the effects of internet path faults on reactive routing
SIGMETRICS '03: Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems