ACM Home Page
Please provide us with feedback. Feedback
Scalable computation of acyclic joins
Full text PdfPdf (204 KB)
Source Symposium on Principles of Database Systems archive
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems table of contents
Chicago, IL, USA
SESSION: Query processing table of contents
Pages: 225 - 232  
Year of Publication: 2006
ISBN:1-59593-318-2
Authors
Anna Pagh  IT University of Copenhagen, København, Denmark
Rasmus Pagh  IT University of Copenhagen, København, Denmark
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
SIGART: ACM Special Interest Group on Artificial Intelligence
ACM: Association for Computing Machinery
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 3,   Downloads (12 Months): 36,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1142351.1142384
What is a DOI?

ABSTRACT

The join operation of relational algebra is a cornerstone of relational database systems. Computing the join of several relations is NP-hard in general, whereas special (and typical) cases are tractable. This paper considers joins having an acyclic join graph, for which current methods initially apply a full reducer to efficiently eliminate tuples that will not contribute to the result of the join. From a worst-case perspective, previous algorithms for computing an acyclic join of k fully reduced relations, occupying a total of n≥k blocks on disk, use Ω((n+z)k) I/Os, where z is the size of the join result in blocks.In this paper we show how to compute the join in a time bound that is within a constant factor of the cost of running a full reducer plus sorting the output. For a broad class of acyclic join graphs this is O(sort(n+z)) I/Os, removing the dependence on k from previous bounds. Traditional methods decompose the join into a number of binary joins, which are then carried out one by one. Departing from this approach, our technique is based on computing the size of certain subsets of the result, and using these sizes to compute the location(s) of each data item in the result.Finally, as an initial study of cyclic joins in the I/O model, we show how to compute a join whose join graph is a 3-cycle, in O(n2/m+sort(n+z)) I/Os, where m is the number of blocks in internal memory.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
I. Baran, E. D. Demaine, and M. Patrascu. Subquadratic algorithms for 3SUM. In Proceedings of WADS, volume 3608 of Lecture Notes in Computer Science, pages 409--421, 2005.
3
4
 
5
6
 
7
Y. E. Ioannidis. Query optimization. In Computer Science Handbook, Second Edition, chapter 55. Chapman & Hall/CRC, 2004.
8
9
10
 
11
12
 
13
 
14
D. E. Willard. An algorithm for handling many relational calculus queries efficiently. J. Comput. System Sci., 65(2):295--331, 2002.
 
15
M. Yannakakis. Algorithms for acyclic database schemes. In 7th International Conference on Very Large Data Bases (VLDB), pages 82--94. IEEE, 1981.
 
16
C. T. Yu and M. Z. Ozsoyoglu. An algorithm for tree-query membership of a distributed query. In Proceedings of Computer Software and Applications Conference, pages 306--312. IEEE, 1979.