ABSTRACT
An in-memory data grid (IMDG) is a novel data processing middleware for Internetware. It provides higher scalability and performance than a traditional relational database. However, because data stored in an IMDG must follow the key/value data model, new challenges arise. One important issue is that IMDGs do not support standard data accessing interfaces such as JPA and SQL, so application developers must design their programs around the peculiarities of a particular IMDG product. This results in complex and error-prone code, especially for programmers who lack a deep understanding of IMDGs. In this paper, we propose a data accessing reference architecture for IMDGs and a methodology for designing and implementing the data accessing layer. The methodology covers construction of the data accessing engine, design of the data model, and support for join operations. Following this methodology, we develop a JPA-compatible data accessing engine for Hazelcast as a case study, which demonstrates the feasibility of our approach.
Index Terms
- Constructing a data accessing layer for in-memory data grid