ABSTRACT
Existing SQL aggregate functions present important limitations to compute percentages. This article proposes two SQL aggregate functions to compute percentages addressing such limitations. The first function returns one row for each percentage in vertical form like standard SQL aggregations. The second function returns each set of percentages adding 100% on the same row in horizontal form. These novel aggregate functions are used as a framework to introduce the concept of percentage queries and to generate efficient SQL code. Experiments study different percentage query optimization strategies and compare evaluation time of percentage queries taking advantage of our proposed aggregations against queries using available OLAP extensions. The proposed percentage aggregations are easy to use, have wide applicability and can be efficiently evaluated.
- S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. SIGMOD Record, 26(1):65--74, 1997. Google ScholarDigital Library
- J. Clear, D. Dunn, B. Harvey, M. L. Heytens, and P. Lohman. Non-stop SQL/MX primitives for knowledge discovery. In ACM KDD Conference, pages 425--429, 1999. Google ScholarDigital Library
- G. Graefe, U. Fayyad, and S. Chaudhuri. On the efficient gathering of sufficient statistics for classification from large SQL databases. In ACM KDD Conference, pages 204--208, 1998.Google Scholar
- J. Gray, A. Bosworth, A. Layman, and H. Piharesh. A relational aggregation operator generalizing group-by, cross-tab and sub-total. In ICDE Conference, 1996. Google ScholarDigital Library
- J. Han, J. Pei, G. Dong, and K. Wang. Efficient computation of iceberg cubes with complex measures. In ACM SIGMOD Conference, pages 1--12, 2001. Google ScholarDigital Library
- ISO-ANSI. Amendment 1: On-Line Analytical Processing, SQL/OLAP, pages 46--55. ANSI, 1999.Google Scholar
- J. Widom. Research poblems in data warehousing. In ACM CIKM Conference, pages 25--30, 1995. Google ScholarDigital Library
- A. Witkowski, S. Bellamkonda, T. Bozkaya, G. Dorman, N. Folkert, A. Gupta, L. Sheng, and S. Subramanian. Spreadsheets in RDBMS for OLAP. In ACM SIGMOD Conference, pages 52--63, 2003. Google ScholarDigital Library
- M. Zaharioudakis, M. Cochrane, R. Lapis, H. Piharesh, and M. Urata. Answering complex SQL queries using automatic summary tables. In ACM SIGMOD Conference, pages 105--116, 2000. Google ScholarDigital Library
- Y. Zhuge, H. Garcia-Molina, and J. Hammer. View maintenance in a warehousing environment. In ACM SIGMOD Conference, pages 316--327, 1995. Google ScholarDigital Library
- Vertical and horizontal percentage aggregations
Recommendations
Dynamic optimization of generalized SQL queries with horizontal aggregations
SIGMOD '12: Proceedings of the 2012 ACM SIGMOD International Conference on Management of DataSQL presents limitations to return aggregations as tables with a horizontal layout. A user generally needs to write separate queries and data definition statements to combine transposition with aggregation. With that motivation in mind, we introduce ...
Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis
Preparing a data set for analysis is generally the most time consuming task in a data mining project, requiring many complex SQL queries, joining tables, and aggregating columns. Existing SQL aggregations have limitations to prepare data sets because ...
Horizontal aggregations for building tabular data sets
DMKD '04: Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discoveryIn a data mining project, a significant portion of time is devoted to building a data set suitable for analysis. In a relational database environment, building such data set usually requires joining tables and aggregating columns with SQL queries. ...
Comments