research-article

Reducing Layered Database Applications to their Essence through Vertical Integration

Published: 23 October 2015

Abstract

In the last decade, improvements in the single-core performance of CPUs have stagnated. Consequently, methods for developing and optimizing software for these platforms have to be reconsidered. Software must be optimized so that the available single-core performance is exploited more effectively. This can be achieved by reducing the number of instructions that need to be executed. In this article, we show that layered database applications execute many redundant, nonessential instructions that can be eliminated without affecting the course of execution or the output of the application. This elimination is performed by a vertical integration process that breaks down the different layers of layered database applications. By doing so, applications are reduced to their essence, and as a consequence, transformations that affect both the application code and the data access code, and that were not possible before, can be carried out. We show that this vertical integration process can be fully automated and, as such, be integrated into an operational workflow. Experimental evaluation of this approach shows that up to 95% of the instructions can be eliminated. The reduction in instructions leads to a more efficient use of the available hardware resources. This results in greatly improved application performance and a significant reduction in energy consumption.
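As a rough illustration of the idea, and not the authors' tool or code, the sketch below contrasts a layered computation with a vertically integrated one. The table, the `run_query` helper, and both `total_*` functions are hypothetical names invented for this example. The layered version materializes and copies a result set purely to cross the boundary between the data access code and the application code; the integrated version produces the same output in a single loop with no intermediate result.

```python
# Hypothetical in-memory table; the schema and values are illustrative only.
ITEMS = [
    {"id": 1, "category": "books", "price": 12.0},
    {"id": 2, "category": "music", "price": 8.5},
    {"id": 3, "category": "books", "price": 30.0},
]


def run_query(table, column, value):
    """Data access layer: filter rows and materialize a copied result set."""
    result_set = []
    for row in table:
        if row[column] == value:
            # The copy exists only to hand rows across the layer boundary.
            result_set.append(dict(row))
    return result_set


def total_layered(table, category):
    """Application layer: iterates a second time over the materialized rows."""
    total = 0.0
    for row in run_query(table, "category", category):
        total += row["price"]
    return total


def total_integrated(table, category):
    """Layers collapsed: one loop, no intermediate result set, same output."""
    total = 0.0
    for row in table:
        if row["category"] == category:
            total += row["price"]
    return total
```

Both functions return the same total for any category; the integrated form simply skips the row copies and the second pass, which is the kind of nonessential work the abstract refers to.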



          • Published in

            ACM Transactions on Database Systems, Volume 40, Issue 3
            October 2015
            247 pages
            ISSN:0362-5915
            EISSN:1557-4644
            DOI:10.1145/2838914

            Copyright © 2015 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 23 October 2015
            • Accepted: 1 May 2015
            • Revised: 1 October 2014
            • Received: 1 December 2013


            Qualifiers

            • research-article
            • Research
            • Refereed
