DOI: 10.1145/2588555.2593679
research-article

Online optimization and fair costing for dynamic data sharing in a cloud data market

Published: 18 June 2014

Abstract

Data markets are emerging to address many organizations' need to find more useful external data sets for deeper insights. Existing data markets have two main limitations. First, they sell either a whole data set or some fixed views, but do not allow arbitrary ad-hoc queries. Second, they sell only static data, not data that are frequently updated. While solutions have been proposed for selling ad-hoc queries, it remains an open question what mechanism should be used to sell ad-hoc queries on dynamic data. In this paper, we study a data market framework that enables the sale or sharing of dynamic data, where each sharing is specified by an ad-hoc query. To keep the shared data up to date, the service provider creates a view of the shared data and maintains that view for the data buyer. We provide solutions to two important problems in this framework: how to maintain the views efficiently, and how to fairly determine the cost each sharing incurs for having its view created and maintained by the service provider. Both problems are challenging. In the first, different sharings with the same operations in their plans may reuse these operations, and each sharing plan must be generated online. In the second, several factors affect the fairness of a costing function, so a straightforward mechanism does not work. We propose an intuitive online algorithm for sharing plan selection, as well as a set of fair costing criteria and an algorithm that maximizes fairness.
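The abstract does not spell out the plan-selection algorithm or the costing function, so the following is only a minimal sketch of the two ideas it names: choosing a sharing plan online so that already-maintained operations are reused, and splitting the cost of a reused operation among the sharings that use it. The operator names, the cost numbers, the greedy lowest-marginal-cost rule, and the equal-split rule are all assumptions made for illustration, not the paper's method.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical per-operator maintenance costs (illustrative numbers only).
OP_COST: Dict[str, float] = {
    "scan_R": 4.0, "scan_S": 3.0, "join_RS": 6.0, "filter_R": 1.0, "agg": 2.0,
}


def marginal_cost(plan: FrozenSet[str], maintained: Set[str]) -> float:
    """Cost of the operators in `plan` that are not already maintained."""
    return sum(OP_COST[op] for op in plan if op not in maintained)


def select_plan_online(candidates: List[FrozenSet[str]],
                       maintained: Set[str]) -> FrozenSet[str]:
    """Assumed greedy rule: pick the candidate with the lowest marginal cost."""
    return min(candidates, key=lambda plan: marginal_cost(plan, maintained))


def split_costs(plans: Dict[str, FrozenSet[str]]) -> Dict[str, float]:
    """Assumed equal split: each operator's cost is divided among its users."""
    users: Dict[str, int] = {}
    for plan in plans.values():
        for op in plan:
            users[op] = users.get(op, 0) + 1
    return {
        name: sum(OP_COST[op] / users[op] for op in plan)
        for name, plan in plans.items()
    }


# Sharings arrive one at a time, each with its candidate plans.
arrivals = [
    ("s1", [frozenset({"scan_R", "scan_S", "join_RS"})]),
    ("s2", [frozenset({"scan_R", "filter_R", "agg"}),
            frozenset({"scan_R", "scan_S", "join_RS", "agg"})]),
]

maintained: Set[str] = set()
plans: Dict[str, FrozenSet[str]] = {}
for name, candidates in arrivals:
    plans[name] = select_plan_online(candidates, maintained)
    maintained |= plans[name]  # chosen operators become reusable

costs = split_costs(plans)
print(plans["s2"], costs)
```

In this toy run, the second sharing picks the plan that reuses the join already maintained for the first one, because only the aggregation is new. The equal split then charges both sharings for the shared scans and join, and the per-sharing charges add up to the total cost of the maintained operators. An equal split is only one possible rule; the abstract notes that several factors affect fairness and that a straightforward mechanism does not work.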


Published In

SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
June 2014
1645 pages
ISBN: 9781450323765
DOI: 10.1145/2588555

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. costing
  2. data market
  3. fairness
  4. online
  5. sharing

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'14
Acceptance Rates

SIGMOD '14 Paper Acceptance Rate 107 of 421 submissions, 25%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 17
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 07 Mar 2025

Citations

Cited By

  • (2024) Strategies for data supply in high-granularity data trade in smart cities. Environment Systems and Decisions 45:1. DOI: 10.1007/s10669-024-09994-7. Online publication date: 27-Nov-2024
  • (2023) Managing Conflicting Interests of Stakeholders in Influencer Marketing. Proceedings of the ACM on Management of Data 1:1 (1-27). DOI: 10.1145/3588934. Online publication date: 30-May-2023
  • (2022) Incentive mechanism for federated learning based on blockchain and Bayesian game. SCIENTIA SINICA Informationis 52:6 (971). DOI: 10.1360/SSI-2022-0020. Online publication date: 13-Jun-2022
  • (2022) A Survey on Data Pricing: From Economics to Data Science. IEEE Transactions on Knowledge and Data Engineering 34:10 (4586-4608). DOI: 10.1109/TKDE.2020.3045927. Online publication date: 1-Oct-2022
  • (2021) Business Data Sharing through Data Marketplaces: A Systematic Literature Review. Journal of Theoretical and Applied Electronic Commerce Research 16:7 (3321-3339). DOI: 10.3390/jtaer16070180. Online publication date: 3-Dec-2021
  • (2019) A Self-Controllable and Balanced Data Sharing Model. IEEE Access 7 (103275-103290). DOI: 10.1109/ACCESS.2019.2931982. Online publication date: 2019
  • (2018) A Survey on Big Data Market: Pricing, Trading and Protection. IEEE Access 6 (15132-15154). DOI: 10.1109/ACCESS.2018.2806881. Online publication date: 2018
