DOI: 10.1145/1242572.1242614
Article

GlobeTP: template-based database replication for scalable web applications

Published: 08 May 2007

Abstract

Generic database replication algorithms do not scale linearly in throughput because all update, deletion and insertion (UDI) queries must be applied to every database replica. The throughput is therefore limited to the point where the number of UDI queries alone is sufficient to overload one server. In such scenarios, partial replication of a database can help, as UDI queries are executed only by a subset of all servers. In this paper we propose GlobeTP, a system that employs partial replication to improve database throughput. GlobeTP exploits the fact that a Web application's query workload is composed of a small set of read and write templates. Using knowledge of these templates and their respective execution costs, GlobeTP provides database table placements that produce significant improvements in database throughput. We demonstrate the efficiency of this technique using two different industry-standard benchmarks. In our experiments, GlobeTP increases the throughput by 57% to 150% compared to full replication, while using an identical hardware configuration. Furthermore, adding a single query cache improves the throughput by another 30% to 60%.
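
The argument in the abstract can be illustrated with a small sketch. It is not GlobeTP's placement algorithm, only a toy load model under stated assumptions: each query template has a single hypothetical cost figure, a UDI template must run on every server that hosts any table it touches, and a read template is spread evenly over the servers that host all of its tables. The names `Template` and `server_loads`, the two example tables, and all cost values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    name: str
    tables: frozenset   # tables the template touches
    cost: float         # hypothetical load: request rate x execution cost
    is_write: bool      # True for UDI (update/deletion/insertion) templates


def server_loads(templates, placement):
    """Return the load on each server for a given table placement.

    placement is a list with one frozenset of table names per server.
    Assumption: reads are balanced evenly across eligible servers.
    """
    loads = [0.0] * len(placement)
    for t in templates:
        if t.is_write:
            # A UDI query is applied by every server holding an affected table.
            targets = [i for i, hosted in enumerate(placement) if t.tables & hosted]
        else:
            # A read can only run where all of its tables are present.
            targets = [i for i, hosted in enumerate(placement) if t.tables <= hosted]
        if not targets:
            raise ValueError(f"no server can execute template {t.name!r}")
        share = t.cost if t.is_write else t.cost / len(targets)
        for i in targets:
            loads[i] += share
    return loads


# Hypothetical workload: four templates over two tables.
workload = [
    Template("read_items", frozenset({"items"}), 6.0, False),
    Template("read_orders", frozenset({"orders"}), 2.0, False),
    Template("write_items", frozenset({"items"}), 2.0, True),
    Template("write_orders", frozenset({"orders"}), 4.0, True),
]

# Full replication: both servers host every table and apply every UDI query.
full = server_loads(workload, [frozenset({"items", "orders"})] * 2)

# Partial replication: each table lives on one server only.
partial = server_loads(workload, [frozenset({"items"}), frozenset({"orders"})])

print("full replication:   ", full)
print("partial replication:", partial)
```

In this toy workload the busiest server carries less load under the partial placement, because each server applies only the UDI templates for the table it hosts. Real placements must also handle templates that join several tables, which this sketch covers only by requiring all of a read's tables to be on the same server.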

Published In

WWW '07: Proceedings of the 16th international conference on World Wide Web
May 2007
1382 pages
ISBN:9781595936547
DOI:10.1145/1242572
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. database replication
  2. partial replication
  3. scalability
  4. web applications

Qualifiers

  • Article

Conference

WWW'07: 16th International World Wide Web Conference
May 8 - 12, 2007
Banff, Alberta, Canada

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Cited By

  • (2015) Scaling Geo-replicated Databases to the MEC Environment. Proceedings of the 2015 IEEE 34th Symposium on Reliable Distributed Systems Workshop (SRDSW), pages 74-79. DOI: 10.1109/SRDSW.2015.13. Online publication date: 28-Sep-2015
  • (2012) Server Replication in Multicast Networks. Proceedings of the 2012 10th International Conference on Frontiers of Information Technology, pages 337-341. DOI: 10.1109/FIT.2012.67. Online publication date: 17-Dec-2012
  • (2011) Automatic physical database tuning middleware for web-based applications. Proceedings of the 15th international conference on Advances in databases and information systems, pages 361-374. DOI: 10.5555/2041746.2041781. Online publication date: 20-Sep-2011
  • (2011) Automatic Physical Database Tuning Middleware for Web-Based Applications. Advances in Databases and Information Systems, pages 361-374. DOI: 10.1007/978-3-642-23737-9_26. Online publication date: 2011
  • (2009) P2P based hosting system for scalable replicated databases. Proceedings of the 2009 EDBT/ICDT Workshops, pages 47-54. DOI: 10.1145/1698790.1698800. Online publication date: 22-Mar-2009
  • (2009) A survey on dynamic Web content generation and delivery techniques. Journal of Network and Computer Applications, 32:5, pages 943-960. DOI: 10.1016/j.jnca.2009.03.005. Online publication date: 1-Sep-2009
  • (2008) Scalable query result caching for web applications. Proceedings of the VLDB Endowment, 1:1, pages 550-561. DOI: 10.14778/1453856.1453917. Online publication date: 1-Aug-2008
  • (2008) Service-oriented data denormalization for scalable web applications. Proceedings of the 17th international conference on World Wide Web, pages 267-276. DOI: 10.1145/1367497.1367535. Online publication date: 21-Apr-2008
  • (2008) Content Delivery and Management. Content Delivery Networks, pages 105-126. DOI: 10.1007/978-3-540-77887-5_4. Online publication date: 2008
  • (2007) Agility in virtualized utility computing. Proceedings of the 2nd international workshop on Virtualization technology in distributed computing, pages 1-8. DOI: 10.1145/1408654.1408663. Online publication date: 12-Nov-2007