skip to main content
10.1145/2463676.2465311acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Value invention in data exchange

Published: 22 June 2013 Publication History

Abstract

The creation of values to represent incomplete information, often referred to as value invention, is central in data exchange. Within schema mappings, Skolem functions have long been used for value invention as they permit a precise representation of missing information. Recent work on a powerful mapping language called second-order tuple generating dependencies (SO tgds), has drawn attention to the fact that the use of arbitrary Skolem functions can have negative computational and programmatic properties in data exchange. In this paper, we present two techniques for understanding when the Skolem functions needed to represent the correct semantics of incomplete information are computationally well-behaved. Specifically, we consider when the Skolem functions in second-order (SO) mappings have a first-order (FO) semantics and are therefore programmatically and computationally more desirable for use in practice. Our first technique, linearization, significantly extends the Nash, Bernstein and Melnik unskolemization algorithm, by understanding when the sets of arguments of the Skolem functions in a mapping are related by set inclusion. We show that such a linear relationship leads to mappings that have FO semantics and are expressible in popular mapping languages including source-to-target tgds and nested tgds. Our second technique uses source semantics, specifically functional dependencies (including keys), to transform SO mappings into equivalent FO mappings. We show that our algorithms are applicable to a strictly larger class of mappings than previous approaches, but more importantly we present an extensive experimental evaluation that quantifies this difference (about 78% improvement) over an extensive schema mapping benchmark and illustrates the applicability of our results on real mappings.

References

[1]
B. Alexe, M. A. Hernández, L. Popa, and W. C. Tan. MapMerge: Correlating Independent Schema Mappings. VLDB J., 21(2):191--211, 2012.
[2]
B. Alexe, W. C. Tan, and Y. Velegrakis. STBenchmark: Towards a Benchmark for Mapping Systems. PVLDB, 1(1):230--244, 2008.
[3]
B. Alexe, B. ten Cate, P. Kolaitis, and W. Tan. Designing and refining schema mappings via data examples. In SIGMOD, pages 133--144, 2011.
[4]
Y. An, A. Borgida, R. J. Miller, and J. Mylopoulos. A Semantic Approach to Discovering Schema Mapping Expressions. In ICDE, pages 206--215, 2007.
[5]
M. Arenas, R. Fagin, and A. Nash. Composition with Target Constraints. Logical Methods in Comput. Sci., 7(3), 2011.
[6]
M. Arenas, J. Pérez, J. Reutter, and C. Riveros. Inverting Schema Mappings: Bridging the Gap between Theory and Practice. PVLDB, 2(1):1018--1029, 2009.
[7]
P. C. Arocena, M. D'Angelo, B. Glavic, and R. J. Miller. STBenchmark 2.0. Technical report, University of Toronto, 2013. http://dblab.cs.toronto.edu/project/STBench2.0.
[8]
P. A. Bernstein, T. J. Green, S. Melnik, and A. Nash. Implementing Mapping Composition. VLDB J., 17(2):333--353, 2008.
[9]
P. A. Bernstein, A. Y. Halevy, and R. A. Pottinger. A Vision for Management of Complex Models. SIGMOD Record, 29(4):55--63, Dec. 2000.
[10]
H. B. Enderton. A Mathematical Introduction to Logic. Harcout Academic Press, 2nd. Edition, 2001.
[11]
R. Fagin, P. Kolaitis, R. J. Miller, and L. Popa. Data Exchange: Semantics and Query Answering. Theor. Comput. Sci., 336(1):89--124, 2005.
[12]
R. Fagin, P. Kolaitis, L. Popa, and W. C. Tan. Composing Schema Mappings: Second-Order Dependencies to the Rescue. TODS, 30(4):994--1055, 2005.
[13]
I. Feinerer, R. Pichler, E. Sallinger, and V. Savenkov. On the Undecidability of the Equivalence of Second-Order Tuple Generating Dependencies. In AMW, 2011.
[14]
A. Fuxman, M. A. Hernandez, H. Ho, R. J. Miller, P. Papotti, and L. Popa. Nested Mappings: Schema Mapping Reloaded. In VLDB, pages 67--78, 2006.
[15]
A. Fuxman, P. Kolaitis, R. J. Miller, and W. C. Tan. Peer Data Exchange. TODS, 31(4):1454--1498, 2006.
[16]
D. Gabbay, R. Schmidt, and A. Szalas. Second Order Quantifier Elimination: Foundations, Computational Aspects and Applications. College Publications, 2008.
[17]
T. J. Green, G. Karvounarakis, N. E. Taylor, O. Biton, Z. G. Ives, and V. Tannen. ORCHESTRA: Facilitating Collaborative Data Sharing. In SIGMOD, pages 1131--1133, 2007.
[18]
R. Hull and M. Yoshikawa. ILOG: Declarative Creation and Manipulation of Object Identifiers. In VLDB, pages 455--468, 1990.
[19]
A. Klug and R. Price. Determining View Dependencies Using Tableaux. TODS, 7(3):361--380, 1982.
[20]
M. K. Lawrence, R. A. Pottinger, and S. Staub-French. Data Coordination: Supporting Contingent Updates. PVLDB, 4(11):831--842, 2011.
[21]
M. Lenzerini. Data Integration: a Theoretical Perspective. In PODS, pages 233--246, 2002.
[22]
L. Libkin and C. Sirangelo. Data Exchange and Schema Mappings in Open and Closed Worlds. J. Comput. Syst. Sci., 77(3):542--571, 2011.
[23]
B. Marnette, G. Mecca, and P. Papotti. Scalable Data Exchange with Functional Dependencies. PVLDB, 3(1):105--116, 2010.
[24]
B. Marnette, G. Mecca, P. Papotti, S. Raunich, and D. Santoro. ++Spicy: an OpenSource Tool for Second-Generation Schema Mapping and Data Exchange. PVLDB, 4(12):1438--1441, 2011.
[25]
R. J. Miller, D. Fisla, M. Huang, D. Kymlicka, F. Ku, and V. Lee. The Amalgam Schema and Data Integration Test Suite. www.cs.toronto.edu/~miller/amalgam, 2001.
[26]
A. Nash, P. A. Bernstein, and S. Melnik. Composition of Mappings Given by Embedded Dependencies. TODS, 32(1):4, 2007.
[27]
Y. Papakonstantinou, S. Abiteboul, and H. Garcia-Molina. Object Fusion in Mediator Systems. In VLDB, pages 413--424, 1996.
[28]
R. Pichler and S. Skritek. The Complexity of Evaluating Tuple Generating Dependencies. In ICDT, pages 244--255, 2011.
[29]
L. Popa, Y. Velegrakis, R. J. Miller, M. A. Hernandez, and R. Fagin. Translating Web Data. In VLDB, pages 598--609, 2002.
[30]
L. Seligman, P. Mork, A. Y. Halevy, K. P. Smith, M. J. Carey, K. Chen, C. Wolf, J. Madhavan, A. Kannan, and D. Burdick. OpenII: an Open Source Information Integration Toolkit. In SIGMOD, pages 1057--1060, 2010.
[31]
B. ten Cate and P. Kolaitis. Structural Characterizations of Schema-Mapping Languages. In ICDT, pages 63--72, 2009.
[32]
C. Yu and L. Popa. Semantic Adaptation of Schema Mappings when Schemas Evolve. VDLB, pages 1006--1017, 2005.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
June 2013
1322 pages
ISBN:9781450320375
DOI:10.1145/2463676
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 June 2013

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. data exchange
  2. linearization
  3. schema mappings
  4. skolemization
  5. unskolemization
  6. value invention

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'13
Sponsor:

Acceptance Rates

SIGMOD '13 Paper Acceptance Rate 76 of 372 submissions, 20%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)10
  • Downloads (Last 6 weeks)3
Reflects downloads up to 20 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2022)Tools for Activating Data Marketplace (2)Tools for Activating Data Marketplace10.1007/978-3-031-06145-5_4(85-142)Online publication date: 1-Dec-2022
  • (2020)Data migration using datalog program synthesisProceedings of the VLDB Endowment10.14778/3384345.338435013:7(1006-1019)Online publication date: 26-Mar-2020
  • (2019)Rewriting of plain SO tgds into nested tgdsProceedings of the VLDB Endowment10.14778/3342263.334263112:11(1526-1538)Online publication date: 1-Jul-2019
  • (2018)Reflections on Schema Mappings, Data Exchange, and Metadata ManagementProceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems10.1145/3196959.3196991(107-109)Online publication date: 27-May-2018
  • (2017)J-LogicProceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems10.1145/3034786.3056106(137-149)Online publication date: 9-May-2017
  • (2017)Drawing Density Core-Sets from Incomplete Relational DataDatabase Systems for Advanced Applications10.1007/978-3-319-55699-4_32(527-542)Online publication date: 22-Mar-2017
  • (2016)Determining the Real Data Completeness of a Relational DatasetJournal of Computer Science and Technology10.1007/s11390-016-1659-x31:4(720-740)Online publication date: 8-Jul-2016
  • (2016)Mapping-equivalence and oid-equivalence of single-function object-creating conjunctive queriesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-016-0421-x25:3(381-397)Online publication date: 1-Jun-2016
  • (2015)The iBench integration metadata generatorProceedings of the VLDB Endowment10.14778/2850583.28505869:3(108-119)Online publication date: 1-Nov-2015
  • (2015)On the undecidability of the equivalence of second-order tuple generating dependenciesInformation Systems10.1016/j.is.2014.09.00348:C(113-129)Online publication date: 1-Mar-2015
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media