ABSTRACT
This paper proposes a new formula protocol for distributed concurrency control and specifies a staged grid architecture for highly scalable database management systems. It also describes novel implementation techniques used in Rubato DB, which is built on the proposed protocol and architecture. Extensive experiments show that Rubato DB scales well and performs efficiently under both the TPC-C and YCSB benchmarks. These results support the claim that the formula protocol and the staged grid architecture offer a satisfactory solution to an important challenge in database systems: developing a highly scalable database management system that supports consistency levels ranging from ACID to BASE.
Index Terms
- Rubato DB: A Highly Scalable Staged Grid Database System for OLTP and Big Data Applications