|
ABSTRACT
The problem of skyline computation has attracted considerable research attention. In the categorical domain the problem becomes more complicated, primarily due to the partially-ordered nature of the attributes of tuples. In this paper, we initiate a study of streaming categorical skylines. We identify the limitations of existing work for offline categorical skyline computation and realize novel techniques for the problem of maintaining the skyline of categorical data in a streaming environment. In particular, we develop a lightweight data structure for indexing the tuples in the streaming buffer, that can gracefully adapt to tuples with many attributes and partially ordered domains of any size and complexity. Additionally, our study of the dominance relation in the dual space allows us to utilize geometric arrangements in order to index the categorical skyline and efficiently evaluate dominance queries. Lastly, a thorough experimental study evaluates the efficiency of the proposed techniques.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
P. K. Agarwal and M. Sharir. Arrangements and their applications. In Handbook of Computational Geometry, chapter 2, pages 49--119. Elsevier, 2000.
|
| |
2
|
M. J. Atallah and S. Fox, editors. Algorithms and Theory of Computation Handbook. CRC Press, Inc., 1998.
|
| |
3
|
G. Ausiello, M. Protasi, A. Marchetti-Spaccamela, G. Gambosi, P. Crescenzi, and V. Kann. Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties. Springer-Verlag New York, Inc., 1999.
|
| |
4
|
S. Börzsönyi, D. Kossmann, and K. Stocker. The skyline operator. In ICDE, pages 421--430, 2001.
|
| |
5
|
C. Y. Chan, P.-K. Eng, and K.-L. Tan. Stratified computation of skylines with partially-ordered domains. In SIGMOD, pages 203--214, 2005.
|
| |
6
|
C. Y. Chan, H. V. Jagadish, K.-L. Tan, A. K. H. Tung, and Z. Zhang. Finding k-dominant skylines in high dimensional space. In SIGMOD, pages 503--514, 2006.
|
| |
7
|
C. Y. Chan, H. V. Jagadish, K.-L. Tan, A. K. H. Tung, and Z. Zhang. On high dimensional skylines. In EDBT, pages 478--495, 2006.
|
| |
8
|
J. Chomicki, P. Godfrey, J. Gryz, and D. Liang. Skyline with presorting. In ICDE, pages 717--816, 2003.
|
| |
9
|
G. Das, D. Gunopulos, N. Koudas, and N. Sarkas. Ad-hoc top-k query answering for data streams. In VLDB, pages 183--194, 2007.
|
| |
10
|
M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational Geometry: Algorithms and Applications (2nd Edition). Springer-Verlag, 2000.
|
| |
11
|
A. M. Frieze and M. Jerrum. Improved approximation algorithms for max k-cut and max bisection. In IPCO, pages 1--13, 1995.
|
| |
12
|
P. Godfrey, R. Shipley, and J. Gryz. Maximal vector computation in large data sets. In VLDB, pages 229--240, 2005.
|
| |
13
|
D. Halperin. Arrangements. In Handbook of Discrete and Computational Geometry, chapter 24, pages 529--562. CRC Press, 2004.
|
| |
14
|
I. F. Ilyas, V. Markl, P. J. Haas, P. Brown, and A. Aboulnaga. Cords: Automatic discovery of correlations and soft functional dependencies. In SIGMOD, pages 647--658, 2004.
|
| |
15
|
H. V. Jagadish, N. Koudas, S. Muthukrishnan, V. Poosala, K. C. Sevcik, and T. Suel. Optimal histograms with quality guarantees. In VLDB, pages 275--286, 1998.
|
| |
16
|
D. Kossmann, F. Ramsak, and S. Rost. Shooting stars in the sky: An online algorithm for skyline queries. In VLDB, pages 275--286, 2002.
|
| |
17
|
N. Koudas, B. C. Ooi, K.-L. Tan, and R. Zhang. Approximate nn queries on streams with guaranteed error/performance bounds. In VLDB, pages 804--815, 2004.
|
| |
18
|
K. Lee, B. Zheng, H. Li, and W.-C. Lee. Approaching the skyline in z order. In VLDB, pages 279--290, 2007.
|
| |
19
|
X. Lin, Y. Yuan, W. Wang, and H. Lu. Stabbing the sky: Efficient skyline computation over sliding windows. In ICDE, pages 502--513, 2005.
|
| |
20
|
X. Lin, Y. Yuan, Q. Zhang, and Y. Zhang. Selecting stars: The k most representative skyline operator. In ICDE, pages 86--95, 2007.
|
| |
21
|
M. D. Morse, J. M. Patel, and W. I. Grosky. Efficient continuous skyline computation. In ICDE, page 108, 2006.
|
| |
22
|
K. Mouratidis, S. Bakiras, and D. Papadias. Continuous monitoring of top-k queries over sliding windows. In SIGMOD, pages 635--646, 2006.
|
| |
23
|
D. Papadias, Y. Tao, G. Fu, and B. Seeger. An optimal and progressive algorithm for skyline queries. In SIGMOD, pages 467--478, 2003.
|
| |
24
|
K.-L. Tan, P.-K. Eng, and B. C. Ooi. Efficient progressive skyline computation. In VLDB, pages 301--310, 2001.
|
| |
25
|
Y. Tao and D. Papadias. Maintaining sliding window skylines on data streams. IEEE TKDE., 18(2):377--391, 2006.
|
| |
26
|
P. Wu, D. Agrawal, Ö. Egecioglu, and A. E. Abbadi. Deltasky: Optimal maintenance of skyline deletions without exclusive dominance region generation. In ICDE, pages 486--495, 2007.
|
|