ACM Home Page
Please provide us with feedback. Feedback
Prefilter: predicate pushdown at streaming speeds
Full text PdfPdf (423 KB)
Source ACM International Conference Proceeding Series; Vol. 261 archive
Proceedings of the 2nd international workshop on Scalable stream processing system table of contents
Nantes, France
SESSION: Query optimization table of contents
Pages 29-37  
Year of Publication: 2008
ISBN:978-159593-963-0
Authors
Lukasz Golab  AT&T Labs - Research
Theodore Johnson  AT&T Labs - Research
Oliver Spatscheck  AT&T Labs - Research
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 2,   Downloads (12 Months): 18,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1379272.1379280
What is a DOI?

ABSTRACT

This paper presents the prefilter: a predicate pushdown framework for a Data Stream Management System (DSMS). Though early predicate evaluation is a well-known query optimization strategy, novel problems arise in a high-performance DSMS. In particular, (i) query invocation costs are high as compared to the cost of evaluating simple predicates that are often used in high-speed stream analysis; (ii) selectivity estimates may become inaccurate over time; and (iii) multiple queries, possibly containing common subexpressions, must be processed continuously. The prefilter addresses these issues by constructing appropriate predicates for early evaluation as soon as new data arrive and before any queries are invoked. It also compresses the bit vector representing the outcomes of pushed-down predicates over newly arrived tuples, and uses the compressed bitmap to efficiently check which queries do not have to be invoked. Using a set of network monitoring queries, we show that the performance of the Gigascope DSMS is significantly improved by the prefilter.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. SIGMOD 1993, 207--216.
 
2
B. Babcock and S. Chaudhuri. Towards a robust query optimizer: a principled and practical approach. SIGMOD 2005, 119--130.
 
3
B. Babcock, M. Datar, and R. Motwani. Load shedding for aggregation queries over data streams. ICDE 2004, 350--361.
 
4
S. Chandrasekaran and M. J. Franklin. PSoup: a system for streaming queries over streaming data. The VLDB Journal, 12(2): 140--156, 2003.
 
5
J. Chen, D. DeWitt, and J. Naughton. Design and Evaluation of Alternative Selection Placement Strategies in Optimizing Continuous Queries. ICDE 2002, 345--357.
 
6
J. Chen, D. DeWitt, F. Tian, and Y. Wang. NiagaraCQ: A scalable continuous query system for Internet databases. SIGMOD 2000, 379--390.
 
7
G. Cormode et al. Holistic UDAFs at streaming speeds. SIGMOD 2004, 35--46.
 
8
C. Cranor, T. Johnson, O. Spatscheck, and V. Shkapenyuk. Gigascope: High performance network monitoring with an SQL interface. SIGMOD 2003, 647--651.
 
9
M. Denny and M. Franklin. Predicate result range caching for continuous queries. SIGMOD 2005, 646--657.
 
10
F. Fabret et al. Filtering algorithms and implementation for very fast publish/subscribe systems. SIGMOD 2001, 115--126.
 
11
M. Garey and D. Johnson. Computers and Intractability. A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
 
12
M. Karnaugh. The map method for synthesis of combinational logic circuits. Journal of Symbolic Logic, 20(2):197, 1955.
 
13
S. Krishnamurthy, M. Franklin, J. Hellerstein, and G. Jacobson. The case for precision sharing. VLDB 2004, 972--986.
 
14
H.-S. Lim et al. Continuous query processing in data streams using duality of data and queries. SIGMOD 2006, 313--324.
 
15
S. Madden, M. Shah, J. Hellerstein, and V. Raman. Continuously adaptive continuous queries over streams. SIGMOD 2002, 49--60.
 
16
R. Motwani et al. Query processing, approximation, and resource management in a data stream management system. CIDR 2003, 245--256.
 
17
K. Munagala, U. Srivastava, and J. Widom. Optimization of continuous queries with shared expensive filters. PODS 2007, 215--224.
 
18
S. Muthukrishnan. Data streams: Algorithms and applications. Foundations and Trends in Theoretical Computer Science, 1(2): 1--67, 2005.
 
19
P. Roy, S. Seshadri, S. Sudarshan, and S. Bhobe. Efficient and extensible algorithms for multi query optimization. SIGMOD 2000, 249--260.
 
20
T. Sellis. Multiple-query optimization. ACM Trans. Database Sys., 13(1):23--52, 1988.
 
21
N. Tatbul et al. Load shedding in a data stream manager. VLDB 2003, 309--320.
 
22
K.-L. Wu, S.-K. Chen, and P. Yu. Interval query indexing for efficient stream processing. CIKM 2004, 88--97.
 
23
R. Zhang, N. Koudas, B. C. Ooi, and D. Srivastava. Multiple aggregations over data streams. SIGMOD 2005, 299--310.

Collaborative Colleagues:
Lukasz Golab: colleagues
Theodore Johnson: colleagues
Oliver Spatscheck: colleagues