Cache behavior of network protocols

Authors:
Erich Nahum
David Yates
Jim Kurose
Don Towsley

All authors: Department of Computer Science, University of Massachusetts, Amherst, MA

ACM SIGMETRICS Performance Evaluation Review, Volume 25, Issue 1, June 1997, pp. 169–180
https://doi.org/10.1145/258623.258686

Published: 01 June 1997

Abstract

In this paper we present a performance study of memory reference behavior in network protocol processing, using an Internet-based protocol stack implemented in the x-kernel running in user space on a MIPS R4400-based Silicon Graphics machine. We use the protocols to drive a validated execution-driven architectural simulator of our machine. We characterize the behavior of network protocol processing, deriving statistics such as cache miss rates and the percentage of time spent waiting for memory. We also determine how sensitive protocol processing is to the architectural environment, varying factors such as cache size and associativity, and predict performance on future machines.

We show that network protocol cache behavior varies widely, with miss rates ranging from 0 to 28 percent, depending on the scenario. We find that instruction cache behavior has the greatest effect on protocol latency in most cases, and that cold cache behavior is very different from warm cache behavior. We demonstrate the upper bounds on performance that can be expected by improving memory behavior, and the impact of features such as associativity and larger cache sizes. In particular, we find that TCP is more sensitive to cache behavior than UDP, gaining larger benefits from improved associativity and bigger caches. We predict that network protocols will scale well with CPU speeds in the future.
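To make the notions of miss rate and associativity concrete, the following is a minimal sketch of a set-associative cache model with LRU replacement. It is not the authors' simulator: the class `SetAssociativeCache`, the helper `run_trace`, the 1 KiB cache size, the 32-byte line size, and the synthetic address trace are all assumptions chosen for illustration only.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy LRU set-associative cache (hypothetical model, not the paper's simulator)."""

    def __init__(self, size_bytes, line_bytes, ways):
        if size_bytes % (line_bytes * ways) != 0:
            raise ValueError("size must be a multiple of line_bytes * ways")
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        # One ordered dict per set; insertion order tracks recency (oldest first).
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Reference one byte address; return True on a hit, False on a miss."""
        block = addr // self.line_bytes
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used line
        lines[tag] = True
        return False

    def miss_rate(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


def run_trace(cache, trace):
    """Feed every address in the trace to the cache and return the miss rate."""
    for addr in trace:
        cache.access(addr)
    return cache.miss_rate()


# Assumed parameters for illustration; not taken from the paper.
CACHE_BYTES = 1024
LINE_BYTES = 32

# Synthetic trace: two addresses exactly one cache size apart, referenced alternately.
conflict_trace = [0, CACHE_BYTES] * 50

direct = SetAssociativeCache(CACHE_BYTES, LINE_BYTES, ways=1)
two_way = SetAssociativeCache(CACHE_BYTES, LINE_BYTES, ways=2)

direct_rate = run_trace(direct, conflict_trace)
two_way_rate = run_trace(two_way, conflict_trace)

print(f"direct-mapped miss rate: {direct_rate:.2f}")
print(f"2-way miss rate:         {two_way_rate:.2f}")
```

In this synthetic trace the two addresses collide in the direct-mapped cache, so every reference misses, while the 2-way cache of the same size misses only on the first touch of each line. Real protocol code produces far less extreme traces; the sketch only illustrates why associativity can change miss rates at a fixed cache size.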



Published in

ACM SIGMETRICS Performance Evaluation Review, Volume 25, Issue 1
June 1997, 298 pages
ISSN: 0163-5999
DOI: 10.1145/258623
Chairmen: John Zahorjan (Univ. of Washington, Seattle), Albert Greenberg (AT&T Research)
Editor: Scott Leutenegger (Univ. of Denver, Denver, CO)

SIGMETRICS '97: Proceedings of the 1997 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
June 1997, 302 pages
ISBN: 0897919092
DOI: 10.1145/258612
Chairmen: John Zahorjan (Univ. of Washington, Seattle), Albert Greenberg (AT&T Research)
Editor: Scott Leutenegger (Univ. of Denver, Denver, CO)

Copyright © 1997 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- article