DOI: 10.1145/2812428.2812429
research-article

Database technologies in the world of big data

Published: 25 June 2015

ABSTRACT

A number of database technologies, usually called NoSQL, are now available, including key-value, column-oriented, and document stores as well as search engines and graph databases. Whereas SQL software vendors offer advanced products capable of handling highly complex queries and transactions, NoSQL databases instead share characteristics related to scaling and performance, such as auto-sharding, distributed query support, and integrated caching. Their drawbacks can include a lack of schema or data consistency, difficulty in testing and maintenance, and the absence of a higher-level query language. Complex data modelling and SQL as the single access tool to data are missing here. On the other hand, recent studies show that both SQL and NoSQL databases have value for both transactional and analytical Big Data. Top database providers offer rearchitected database technologies that combine row data stores with columnar in-memory compression, enabling the processing of large data sets and analytical querying, often over massive, continuous data streams. Technological progress has led to the development of massively parallel processing analytic databases. The paper presents some details of current database technologies, their pros and cons in different application environments, and emerging trends in this area.

      Published in

        CompSysTech '15: Proceedings of the 16th International Conference on Computer Systems and Technologies
        June 2015
        411 pages
        ISBN: 9781450333573
        DOI: 10.1145/2812428

        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 241 of 492 submissions, 49%
