ABSTRACT
This paper presents Capriccio, a scalable thread package for use with high-concurrency servers. While recent work has advocated event-based systems, we believe that thread-based systems can provide a simpler programming model that achieves equivalent or superior performance.By implementing Capriccio as a user-level thread package, we have decoupled the thread package implementation from the underlying operating system. As a result, we can take advantage of cooperative threading, new asynchronous I/O mechanisms, and compiler support. Using this approach, we are able to provide three key features: (1) scalability to 100,000 threads, (2) efficient stack management, and (3) resource-aware scheduling.We introduce linked stack management, which minimizes the amount of wasted stack space by providing safe, small, and non-contiguous stacks that can grow or shrink at run time. A compiler analysis makes our stack implementation efficient and sound. We also present resource-aware scheduling, which allows thread scheduling and admission control to adapt to the system's current resource usage. This technique uses a blocking graph that is automatically derived from the application to describe the flow of control between blocking points in a cooperative thread package. We have applied our techniques to the Apache 2.0.44 web server, demonstrating that we can achieve high performance and scalability despite using a simple threaded programming model.
- A. Adya, J. Howell, M. Theimer, W. J. Bolosky, and J. R. Douceur. Cooperative task management without manual stack management. In Proceedings of the 2002 Usenix ATC, June 2002. Google ScholarDigital Library
- T. E. Anderson, B. N. Bershad, E. D. Lazowska, and H. M. Levy. Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism. ACM Transactions on Computer Systems, 10(1):53--79, February 1992. Google ScholarDigital Library
- A. W. Appel and D. B. MacQueen. Standard ML of New Jersey. In Proceedings of the 3rd International Symposium on Programming Language Implementation and Logic Programming, pages 1--13, 1991.Google ScholarCross Ref
- A. W. Appel and Z. Shao. An empirical and analytic study of stack vs. heap cost for languages with closures. Journal of Functional Programming, 6(1):47--74, Jan 1996.Google ScholarCross Ref
- B. N. Bershad, C. Chambers, S. J. Eggers, C. Maeda, D. McNamee, P. Pardyak, S. Savage, and E. G. Sirer. SPIN - an extensible microkernel for application-specific operating system services. In ACM SIGOPS European Workshop, pages 68--71, 1994. Google ScholarDigital Library
- M. W. Blasgen, J. Gray, M. F. Mitoma, and T. G. Price. The convoy phenomenon. Operating Systems Review, 13(2):20--25, 1979. Google ScholarDigital Library
- R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall, and Y. Zhou. Cilk: An efficient multithreaded runtime system. Journal of Parallel and Distributed Computing, 37(1):55--69, 1996. Google ScholarDigital Library
- D. G. Bobrow and B. Wegbreit. A model and stack implementation of multiple environments. Communications of the ACM, 16(10):591--603, Oct 1973. Google ScholarDigital Library
- M. C. Carlisle, A. Rogers, J. Reppy, and L. Hendren. Early experiences with Olden. In Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing (LNCS), 1993. Google ScholarDigital Library
- A. Chankhunthod, P. B. Danzig, C. Neerdaels, M. F. Schwartz, and K. J. Worrell. A Hierarchical Internet Object Cache. In Proceedings of the 1996 Usenix Annual Technical Conference, January 1996. Google ScholarDigital Library
- P. Cheng and G. E. Blelloch. A parallel, real-time garbage collector. In Proceedings of the 2001 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '01), 2001. Google ScholarDigital Library
- J. Cordina. Fast multithreading on shared memory multiprocessors. Technical report, University of Malta, June 2000.Google Scholar
- D. R. Engler, M. F. Kaashoek, and J. O'Toole. Exokernel: An operating system architecture for application-level resource management. In Symposium on Operating Systems Principles, pages 251--266, 1995. Google ScholarDigital Library
- D. Gay, P. Levis, R. von Behren, M. Welsh, E. Brewer, and D. Culler. The nesC language: A holistic approach to networked embedded systems. In ACM SIGPLAN Conference on Programming Language Design and Implementation, 2003. Google ScholarDigital Library
- S. C. Goldstein, K. E. Schauser, and D. E. Culler. Lazy Threads, Stacklets, and Synchronizers: Enabling primitives for compiling parallel languages. In Third Workshop on Langauges, Compilers, and Run-Time Systems for Scalable Computers, 1995.Google Scholar
- T. Hun. Minimal Context Thread 0.7 manual. http://www.aranetwork.com/docs/mct-manual.pdf, 2002.Google Scholar
- B. LaHaise. Linux AIO home page. http://www.kvack.org/blah/aio/.Google Scholar
- H. C. Lauer and R. M. Needham. On the duality of operating system structures. In Second Inernational Symposium on Operating Systems, IR1A, October 1978.Google Scholar
- J. Lemon. Kqueue: A generic and scalable event notification facility. In USENIX Technical conference, 2001. Google ScholarDigital Library
- D. Libenzi. Linux epoll patch. http://www.xmailserver.org/linux-patches/nio-improve.html.Google Scholar
- D. McNamee and K. Armstrong. Extending the Mach external pager interface to accommodate user-level page replacement policies. Technical Report TR-90-09-05, University of Washington, 1990.Google Scholar
- S. S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann, San Francisco, 2000. Google ScholarDigital Library
- G. C. Necula, S. McPeak, S. P. Rahul, and W. Weimer. CIL: Intermediate language and tools for analysis and transformation of C programs. Lecture Notes in Computer Science, 2304:213--229, 2002. Google ScholarDigital Library
- G. C. Necula, S. McPeak, and W. Weimer. CCured: Type-safe retrofitting of legacy code. In The 29th Annual ACM Symposium on Principles of Programming Languages, pages 128--139. ACM, Jan. 2002. Google ScholarDigital Library
- J. K. Ousterhout. Why Threads Are A Bad Idea (for most purposes). Presentation given at the 1996 Usenix Annual Technical Conference, January 1996.Google Scholar
- V. S. Pai, P. Druschel, and W. Zwaenepoel. Flash: An Efficient and Portable Web Server. In Proceedings of the 1999 Annual Usenix Technical Conference, June 1999. Google ScholarDigital Library
- H. D. Pande and B. G. Ryder. Data- ow-based virtual function resolution. Lecture Notes in Computer Science, 1145:238--254, 1996. Google ScholarDigital Library
- W. Pang and S. D. Goodwin. An algorithm for solving constraint-satisfaction problems.Google Scholar
- M. I. Seltzer, Y. Endo, C. Small, and K. A. Smith. Dealing with disaster: Surviving misbehaved kernel extensions. In Proceedings of the 2nd Symposium on Operating Systems Design and Implementation, pages 213--227, Seattle, Washington, 1996. Google ScholarDigital Library
- M. A. Shah, S. Madden, M. J. Franklin, and J. M. Hellerstein. Java support for data-intensive systems: Experiences building the Telegraph data ow system. SIGMOD Record, 30(4):103--114, 2001. Google ScholarDigital Library
- O. Shivers. Control-Flow Analysis of Higher-Order Languages. PhD thesis, Carnegie-Mellon University, May 1991. Google ScholarDigital Library
- E. Toernig. Coroutine library source. http://www.goron.de/froese/coro/.Google Scholar
- Unknown. Accellerating Apache project. http://aap.sourceforge.net/.Google Scholar
- Unknown. State threads for Internet applications. http://state-threads.sourceforge.net/docs/st.html.Google Scholar
- J. R. von Behren, E. Brewer, N. Borisov, M. Chen, M. Welsh, J. MacDonald, J. Lau, S. Gribble, , and D. Culler. Ninja: A framework for network services. InProceedings of the 2002 Usenix Annual Technical Conference, June 2002. Google ScholarDigital Library
- R. von Behren, J. Condit, and E. Brewer. Why events are a bad idea (for high-concurrency servers). In Proceedings of the 2003 HotOS Workshop, May 2003. Google ScholarDigital Library
- T. von Eicken, A. Basu, V. Buch, and W. Vogels. U-Net: A User-Level Network Interface for Parallel and Distributed Computing. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, Copper Mountain Resort, CO, USA, Decemeber 1995. Google ScholarDigital Library
- M. Welsh, D. E. Culler, and E. A. Brewer. SEDA: An architecture for well-conditioned, scalable Internet services. In Symposium on Operating Systems Principles, pages 230--243, 2001. Google ScholarDigital Library
Index Terms
- Capriccio: scalable threads for internet services
Recommendations
Capriccio: scalable threads for internet services
SOSP '03This paper presents Capriccio, a scalable thread package for use with high-concurrency servers. While recent work has advocated event-based systems, we believe that thread-based systems can provide a simpler programming model that achieves equivalent or ...
Lightweight preemptive user-level threads
PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel ProgrammingMany-to-many mapping models for user- to kernel-level threads (or "M:N threads") have been extensively studied for decades as a lightweight substitute for current Pthreads implementations that provide a simple one-to-one mapping ("1:1 threads"). M:N ...
Arachne: A Portable Threads System Supporting Migrant Threads on Heterogeneous Network Farms
We present the design and implementation of Arachne, a threads system that can be interfaced with a communications library for multithreaded distributed computations. In particular, Arachne supports thread migration between heterogeneous platforms, ...
Comments