Design, implementation, and evaluation of the linear road benchmark on the stream processing core
Source: International Conference on Management of Data
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Chicago, IL, USA
SESSION: Data streams
Pages: 431 - 442  
Year of Publication: 2006
ISBN: 1-59593-434-0
Authors
Navendu Jain  University of Texas at Austin, Austin, TX
Lisa Amini  IBM T. J. Watson Research Center, Hawthorne, NY
Henrique Andrade  IBM T. J. Watson Research Center, Hawthorne, NY
Richard King  IBM T. J. Watson Research Center, Hawthorne, NY
Yoonho Park  IBM T. J. Watson Research Center, Hawthorne, NY
Philippe Selo  IBM T. J. Watson Research Center, Hawthorne, NY
Chitra Venkatramani  IBM T. J. Watson Research Center, Hawthorne, NY
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1142473.1142522

ABSTRACT

Stream processing applications have recently gained significant attention in the networking and database communities. At the core of these applications is a stream processing engine that performs resource allocation and management to support continuous tracking of queries over collections of physically distributed and rapidly updating data streams. While numerous stream processing systems exist, there has been little work on understanding the performance characteristics of these applications in a distributed setup. In this paper, we examine the bottlenecks that keep streaming data applications, in particular the Linear Road stream data management benchmark, from achieving good performance in large-scale distributed environments, using the Stream Processing Core (SPC), a stream processing middleware we have developed. First, we present the design and implementation of the Linear Road benchmark on the SPC middleware. SPC has been designed to scale to tens of thousands of processing nodes, while supporting concurrent applications and multiple simultaneous queries. Second, we identify the main performance bottlenecks that limit the scalability and query response latency of the Linear Road application. Our results show that data locality, buffer capacity, physical allocation of processing elements to infrastructure nodes, and packaging for transporting streamed data are important factors in achieving good application performance. Though we evaluate our system primarily for the Linear Road application, we believe the evaluation also provides useful insights into overall system behavior when supporting other large-scale distributed continuous streaming data applications. Finally, we examine how SPC can be used and tuned to enable a very efficient implementation of the Linear Road application in a distributed environment.

