ABSTRACT
Big data are a trend present all around us, mainly through the Internet (social networks, smart devices and meters), mostly without our being aware of them. They are also a fact that both industry and scientific research need to deal with. They are interesting from an analytical point of view, for they contain knowledge that cannot be ignored and left unused. The traditional system that supports advanced analytics and knowledge extraction, the data warehouse, is not able to cope with large amounts of fast-incoming, varied and unstructured data, and may be facing a paradigm shift in terms of the concepts, technologies and methodologies it uses, which have become a very active research area in the last few years. This paper provides an overview of research trends important for big data warehousing, the concepts and technologies used for data storage and (ETL) processing, and the research approaches taken in attempts to empower traditional data warehouses to handle big data.
Index Terms
- Big Data and New Data Warehousing Approaches