skip to main content
10.1145/2588555.2612175acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Overlap interval partition join

Published:18 June 2014Publication History

ABSTRACT

Each tuple in a valid-time relation includes an interval attribute T that represents the tuple's valid time. The overlap join between two valid-time relations determines all pairs of tuples with overlapping intervals. Although overlap joins are common, existing partitioning and indexing schemes are inefficient if the data includes long-lived tuples or if intervals intersect partition boundaries.

We propose Overlap Interval Partitioning (OIP), a new partitioning approach for data with an interval. OIP divides the time range of a relation into k base granules and defines overlapping partitions for sequences of contiguous granules. OIP is the first partitioning method for interval data that gives a constant clustering guarantee: the difference in duration between the interval of a tuple and the interval of its partition is independent of the duration of the tuple's interval. We offer a detailed analysis of the average false hit ratio and the average number of partition accesses for queries with overlap predicates, and we prove that the average false hit ratio is independent of the number of short- and long-lived tuples. To compute the overlap join, we propose the Overlap Interval Partition Join (OIPJoin), which uses OIP to partition the input relations on-the-fly. Only the tuples from overlapping partitions have to be joined to compute the result. We analytically derive the optimal number of granules, k, for partitioning the two input relations, from the size of the data, the cost of CPU operations, and the cost of main memory or disk IOs. Our experiments confirm the analytical results and show that the OIPJoin outperforms state-of-the-art techniques for the overlap join.

References

  1. The webkit open source project. http://www.webkit.org, 2012.Google ScholarGoogle Scholar
  2. N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: An efficient and robust access method for points and rectangles. In SIGMOD, pages 322--331, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. N. Beckmann and B. Seeger. A revised R*-tree in comparison with related index structures. In SIGMOD, pages 799--812, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. J. L. Bentley. Solutions to klee's rectangle problems. Technical report, Carnegie Mellon University, 1977.Google ScholarGoogle Scholar
  5. M. Berg, M. Kreveld, M. Overmars, and O. Schwarzkopf. More geometric data structures. In Computational Geometry, pages 211--233. Springer Berlin Heidelberg, 2000.Google ScholarGoogle ScholarCross RefCross Ref
  6. A. E. Brouwer, A. M. Cohen, and A. Neumaier. Distance Regular Graphs. Springer-Verlag, 1989.Google ScholarGoogle ScholarCross RefCross Ref
  7. A. Dignös, M. Böhlen, and J. Gamper. Query time scaling of attribute values in interval timestamped databases. In ICDE, pages 1304--1307, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. A. Dignös, M. H. Böhlen, and J. Gamper. Temporal alignment. In SIGMOD, pages 433--444, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Enderle, M. Hampel, and T. Seidl. Joining interval data in relational databases. In SIGMOD, pages 683--694, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. R. A. Finkel and J. L. Bentley. Quad trees: A data structure for retrieval on composite keys. Acta Inf., 4:1--9, 1974.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. D. Gao, C. S. Jensen, R. T. Snodgrass, and M. D. Soo. Join operations in temporal databases. VLDB J., 14(1):2--29, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. A. G. Gendrano, R. Shah, R. T. Snodgrass, and J. Yang. University information system (uis) dataset. TimeCenter CD-1, 1998.Google ScholarGoogle Scholar
  13. N. Koudas and K. C. Sevcik. Size separation spatial join. In SIGMOD, pages 324--335, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. H.-P. Kriegel, M. Pötke, and T. Seidl. Managing intervals efficiently in object-relational databases. In VLDB, pages 407--418, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. M.-L. Lo and C. V. Ravishankar. Spatial hash-joins. In SIGMOD, pages 247--258, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. H. Lu, B. C. Ooi, and K.-L. Tan. On spatially partitioned temporal join. In VLDB, pages 546--557, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. S. Nobari, F. Tauheed, T. Heinis, P. Karras, S. Bressan, and A. Ailamaki. Touch: In-memory spatial join by hierarchical data-oriented partitioning. In SIGMOD, pages 701--712, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. H. Samet. Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. H. Samet, J. Sankaranarayanan, and M. Auerbach. Indexing methods for moving object databases: games and other applications. In SIGMOD, pages 169--180, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. A. Silberschatz, P. B. Galvin, and G. Gagne. Operating System Concepts. Wiley Publishing, 9th edition, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. M. D. Soo, R. T. Snodgrass, and C. S. Jensen. Efficient evaluation of the valid-time natural join. In ICDE, pages 282--292, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. V. J. Tsotras and N. Kangerlaris. The snapshot index: An i/o-optimal access method for timeslice queries. Inf. Syst., 20(3):237--260, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. T. Ulrich. Loose octrees. In Game Programming Gems, pages 444--453. Charles River Media, 2000.Google ScholarGoogle Scholar

Index Terms

  1. Overlap interval partition join

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
        June 2014
        1645 pages
        ISBN:9781450323765
        DOI:10.1145/2588555

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 18 June 2014

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        SIGMOD '14 Paper Acceptance Rate107of421submissions,25%Overall Acceptance Rate785of4,003submissions,20%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader