ABSTRACT
A large amount of popular content is transferred repeatedly across network links in the Internet. In recent years, protocol-independent redundancy elimination, which can remove duplicate strings from within arbitrary network flows, has emerged as a powerful technique to improve the efficiency of network links in the face of repeated data. Many vendors offer such redundancy elimination middleboxes to improve the effective bandwidth of enterprise, data center and ISP links alike.
In this paper, we conduct a large scale trace-driven study of protocol independent redundancy elimination mechanisms, driven by several terabytes of packet payload traces collected at 12 distinct network locations, including the access link of a large US-based university and of 11 enterprise networks of different sizes. Based on extensive analysis, we present a number of findings on the benefits and fundamental design issues in redundancy elimination systems. Two of our key findings are (1) A new redundancy elimination algorithm based on Winnowing that outperforms the widely-used Rabin fingerprint-based algorithm by 5-10% on most traces and by as much as 35% in some traces. (2) A surprising finding that 75-90% of middlebox's bandwidth savings in our enterprise traces is due to redundant byte-strings from within each client's traffic, implying that pushing redundancy elimination capability to the end hosts, i.e. an end-to-end redundancy elimination solution, could obtain most of the middlebox's bandwidth savings.
- Citrix, application delivery infrastructure. http://www.citrix.com/.Google Scholar
- Computerworld -- WAN optimization continues growth. www.computerworld.com.au/index.php/id;1174462047;fp;16;fpid;0/.Google Scholar
- F5 Networks: WAN Delivery Products. http://www.f5.com/.Google Scholar
- Netequalizer Bandwidth Shaper. http://www.netequalizer.com/.Google Scholar
- Packeteer WAN optimization solutions. http://www.packeteer.com/.Google Scholar
- PeerApp: P2P and Media Caching. http://www.peerapp.com.Google Scholar
- Peribit Networks (Acquired by Juniper in 2005): WAN Optimization Solution. http://www.juniper.net/.Google Scholar
- Riverbed Networks: WAN Optimization. http://www.riverbed.com/solutions/optimize/.Google Scholar
- WAN optimization revenues grow 16% -- IT Facts. www.itfacts.biz/wan-optimization--market-to-grow-16/1205/.Google Scholar
- WAN Optimization: Wikipedia entry. http://en.wikipedia.org/wiki/WAN_Optimization.Google Scholar
- P. Abry and D. Veitch.Wavelet analysis of long-range dependent traffic. IEEE Transactions on Information Theory, 44(1):2--15, Jan 1998. Google ScholarDigital Library
- A. Anand, A. Gupta, A. Akella, S. Seshan, and S. Shenker. Packet Caches on Routers: The Implications of Universal Redundant Traffic Elimination. In ACM SIGCOMM, Seattle, WA, Aug. 2008. Google ScholarDigital Library
- N. Bjorner, A. Blass, and Y. Gurevich. Content-Dependent Chunking for Differential Compression, the Local Maximum Approach. Technical Report 109, Microsoft Research, July 2007.Google Scholar
- L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and zipf-like distributions: Evidence and implications. In IEEE Infocom, 1999.Google ScholarCross Ref
- M. Burrows and D. J.Wheeler. A block-sorting lossless data compression algorithm. Technical report, Digital SRC Research Report, 1994.Google Scholar
- F. Dogar, A. Phanishayee, H. Pucha, O. Ruwase, and D. Andersen. Ditto -- A System for Opportunistic Caching in Multi-hop Wireless Mesh Networks. In Proc. ACM Mobicom, San Francisco, CA, Sept. 2008. Google ScholarDigital Library
- L. Fan, P. Cao, J. Almeida, and A. Z. Broder. Summary cache: a scalable wide-area web cache sharing protocol. In SIGCOMM '98, 1998. Google ScholarDigital Library
- D. Goldenberg, L. Qiu, H. Xie, Y. Yang, and Y. Zhang. Optimizing cost and performance for multihoming. In ACM SIGCOMM, 2004. Google ScholarDigital Library
- X. Li, D. Salyers, and A. Striegel. Improving packet caching scalability through the concept of an explicit end of data marker. In HotWeb, 2006.Google ScholarCross Ref
- U. Manber. Finding similar files in a large file system. In USENIX Winter Technical Conference, 1994. Google ScholarDigital Library
- A. Muthitacharoen, B. Chen, and D. Mazières. A low-bandwidth network file system. SIGOPS Oper. Syst. Rev., 35(5), 2001. Google ScholarDigital Library
- H. Pucha, D. G. Andersen, and M. Kaminsky. Exploiting similarity for multi-source downloads using file handprints. In Proc. 4th USENIX NSDI, Cambridge, MA, Apr. 2007. Google ScholarDigital Library
- M. Rabin. Fingerprinting by random polynomials. Technical report, Harvard University, 1981. Technical Report, TR-15-81.Google Scholar
- RouteScience Technologies, Inc. Routescience PathControl. http://www.routescience.com/products.Google Scholar
- S. Schleimer, D. Wilkerson, and A. Aiken. Winnowing: Local algorithms for document fingerprinting. In SIGMOD, 2003. Google ScholarDigital Library
- N. T. Spring and D. Wetherall. A protocol-independent technique for eliminating redundant network traffic. In SIGCOMM, pages 87--95, 2000. Google ScholarDigital Library
- Squid Web Proxy Cache. http://www.squid--cache.org/.Google Scholar
- A. Wolman et al. On the scale and performance of cooperative Web proxy caching. In ACM Symposium on Operating Systems Principles, 1999. Google ScholarDigital Library
- A. Wolman et al. Organization-based Analysis of Web-Object Sharing and Caching. In Proceedings of the 2nd USITS, Oct 1999. Google ScholarDigital Library
- J. Ziv and A. Lempel. A universal algorithm for sequential data compression. Information Theory, IEEE Transactions on, 23(3):337--343, 1977.Google ScholarDigital Library
Index Terms
- Redundancy in network traffic: findings and implications
Recommendations
Packet caches on routers: the implications of universal redundant traffic elimination
SIGCOMM '08: Proceedings of the ACM SIGCOMM 2008 conference on Data communicationMany past systems have explored how to eliminate redundant transfers from network links and improve network efficiency. Several of these systems operate at the application layer, while the more recent systems operate on individual packets. A common ...
Redundancy in network traffic: findings and implications
SIGMETRICS '09A large amount of popular content is transferred repeatedly across network links in the Internet. In recent years, protocol-independent redundancy elimination, which can remove duplicate strings from within arbitrary network flows, has emerged as a ...
Packet caches on routers: the implications of universal redundant traffic elimination
Many past systems have explored how to eliminate redundant transfers from network links and improve network efficiency. Several of these systems operate at the application layer, while the more recent systems operate on individual packets. A common ...
Comments