skip to main content
10.1145/1065167.1065187acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
Article

Computing cores for data exchange: new algorithms and practical solutions

Published: 13 June 2005 Publication History

Abstract

Data Exchange is the problem of inserting data structured under a source schema into a target schema of different structure (possibly with integrity constraints), while reflecting the source data as accurately as possible. We study computational issues related to data exchange in the setting of Fagin, Kolaitis, and Popa(PODS'03). We use the technique of hypertree decompositions to derive improved algorithms for computing the core of a relational instance with labeled nulls, a problem we show to be fixed-parameter intractable with respect to the block size of the input instances. We show that computing the core of a data exchange problem is tractable for two large and useful classes of target constraints. The first class includes functional dependencies and weakly acyclic inclusion dependencies. The second class consists of full tuple generating dependencies and arbitrary equation generating dependencies. Finally, we show that computing cores is NP-hard in presence of a system-predicate NULL(x), which is true iff x is a null value.

References

[1]
S. Abiteboul, R. Hull, V. Vianu. Foundations of Databases, Addison-Wesley Publishing Company, 1995.
[2]
A. Aho, J. Hopcroft, and J. Ullman. Data Structures and Algorithms. Addison-Wesley, Reading MA, 1983.
[3]
A. Aho, J. Sagiv, and J. Ullman. Equivalence of Relational Expressions. SIAM J. Comput. 8:2, 1979.
[4]
A. Aho, J. Sagiv, and J. Ullman. Efficient Optimization of a Class of Relational Expressions. TODS 4:4, 1979.
[5]
M. Arenas, P. Barceló, R. Fagin, and L. Libkin, Locally Consistent Transformations and Query Answering in Data Exchange. Proc. ACM PODS 2004, pp. 229--240.
[6]
P. Atzeni and E. P. F. Chan. Independent Database Schemes under Functional and Inclusion Dependencies. Acta Informatica 28:8, pp. 777--779, 1991.
[7]
C. Beeri and M. Vardi. A Proof Procedure for Data Dependencies. J.ACM. 31:4, PP. 718--741, 1984.
[8]
M. A. Casanova, R. Fagin, and C. H. Papadimitriou. Inclusion Dependencies and Their Interaction with Functional Dependencies. JCSS 28:1, 1984.
[9]
A. K. Chandra and P. M. Merlin. Optimal Implementation of Conjunctive Queries in Relational Databases. STOC 1977.
[10]
C. Chekuri and A. Rajaraman. Conjunctive Query Containment Revisited. TCS 239:2, 2000.
[11]
R. G. Downey and M. R. Fellows. Parameterized Complexity. Springer, 1999.
[12]
R. G. Downey and M. R. Fellows and U. Tailor. The parameterized Complexity of Relational Database Queries and an Improved Characterization of W{1}. In D. S. Bridges et al., editors, Combinatorics, Complexity, and Logic -- Proc. DMTCS'96.
[13]
R. Fagin, Ph. Kolaitis, R. Miller, and L. Popa, Data Exchange: Semantics and Query Answering, ICDT'03, pp. 207--224. Full version to appear in TCS.
[14]
R. Fagin, Ph. Kolaitis, and L. Popa, Data Exchange: Getting to the Core. In Proc. PODS'03, pp. 90--101, 2003. Full version to appear in ACM TODS.
[15]
J. Flum, M. Frick, and M. Grohe. Query Evaluation via Tree-Decompositions. J. ACM49:6, 2002.
[16]
G. Gottlob. Computing Cores for Data Exchange: New Algorithms and Practical Solutions. Extended version of the present paper. Currently available at: www.dbai.tuwien.ac.at/staff/gottlob/extcore.pdf
[17]
G. Gottlob and A. Leitsch. On the efficiency of Subsumption Algorithms. J.ACM 32:2, 1985.
[18]
G. Gottlob, N. Leone, and F. Scarcello. Hypertree Decompositions and Tractable queries. Journal of Computer and System Sciences 64:3, pp. 579--627, 2002.
[19]
G. Gottlob, N. Leone, and F. Scarcello. The Complexity of Acyclic Conjunctive Queries. J.ACM 48:3, pp. 431--498, 2001.
[20]
G. Gottlob, N. Leone, and F. Scarcello. Robbers, Marshals, and Guards: Game Theoretic and Logical Characterizations of Hypertree Width. Journal of Computer and System Sciences 66(4): 775--808, 2003.
[21]
G. Gottlob, N. Leone, and F. Scarcello. A Comparison of Structural CSP Decomposition Methods. Artificial Intelligence 124:2, pp. 243--282, 2000.
[22]
G. Gottlob, R. Pichler. Hypergraphs in Model Checking: Acyclicity in Hypertree-Width versus Clique-Width. SIAM J. Comput., 33:2, 2004.
[23]
C. Gutiérrez, C. Hurtado, and A. Mendelzon. Foundations of Semantic Web Databases. ACM PODS 2004, pp. 95--106, 2004.
[24]
M. Grohe. Parameterized Complexity for the Database Theorist. ACM SIGMOD Record, 31:4, December 2002, pp. 86--96, 2002.
[25]
T. Imielinski. Abstraction in Query Processing. Journal of the ACM 38:3, pp. 534--558, 1991.
[26]
Ph. Kolaitis. Data Exchange: Schema Mappings, Data Exchange, and Metadata Management.PODS'05.
[27]
Ph. G. Kolaitis, M. Y. Vardi. Conjunctive-Query Containment and Constraint Satisfaction. Journal of Computer and System Sciences 61:2, pp. 302--332, 2000.
[28]
M. Levene, G. Loizou. How to Prevent Interaction of Functional and Inclusion Dependencies. Information Processing Letters 71:3--4, pp. 115--125, 1999.
[29]
M. Levene, G. Loizou. Guaranteeing no interaction between functional dependencies and tree-like inclusion dependencies. TCS 254:1--2, 2001.
[30]
A. Y. Levy, A. O. Mendelzon, Y. Sagiv, D. Srivastava. Answering Queries Using Views. PODS 1995. pp. 95--104, 1995.
[31]
H. Mannila and K.-J. Räihä. The Design of Relational Datases. Addison Wesley 1992.
[32]
D. Maier, A. Mendelzon, and Y. Sagiv. Testing Implication of Data Dependencies, ACM TODS 4:4, pp. 455--469, 1979.
[33]
R. Miller, L. Haas, and M. Hernández. Schema Mapping as Query Discovery. VLDB 2000.
[34]
L. Popa, Y. Velegrakis, R. Miller, M. Hernández, and R. Fagin. Translating Web Data. VLDB 2002.
[35]
C.H. Papadimitriou and M. Yannakakis. On the Complexity of Database Queries. Journal of Computers and System Sciences, 58:407--427, 1999.
[36]
X. Qian. Query Folding. ICDE 1996.
[37]
N. Robertson and P. D. Seymour. Graph Minors II. Algorithmic Aspects of Tree-Width. J. Algorithms, 7:309--322, 1986.
[38]
F. Scarcello, G. Greco, and N. Leone. Weighted Hypertree. Decompositions and Optimal Query Plans. PODS 2004.
[39]
J.D. Ullman. Principles of Database and Knowledge Base Systems, Computer Science Press, Rockville, MD, vol. I, 1988, and vol. II, 1989.
[40]
M. Yannakakis. Algorithms for Acyclic Database Schemes. VLDB 1981, pp. 82--94.

Cited By

View all
  • (2022)Answering Queries Using Views, Second EditionundefinedOnline publication date: 26-Feb-2022
  • (2021)Model Finding for ExplorationProtocols, Strands, and Logic10.1007/978-3-030-91631-2_9(156-174)Online publication date: 19-Nov-2021
  • (2020)Consistent updating of databases with marked nullsKnowledge and Information Systems10.1007/s10115-019-01402-w62:4(1571-1609)Online publication date: 1-Apr-2020
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
PODS '05: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2005
388 pages
ISBN:1595930620
DOI:10.1145/1065167
  • General Chair:
  • Georg Gottlob,
  • Program Chair:
  • Foto Afrati
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 June 2005

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

SIGMOD/PODS05

Acceptance Rates

Overall Acceptance Rate 642 of 2,707 submissions, 24%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)0
Reflects downloads up to 09 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2022)Answering Queries Using Views, Second EditionundefinedOnline publication date: 26-Feb-2022
  • (2021)Model Finding for ExplorationProtocols, Strands, and Logic10.1007/978-3-030-91631-2_9(156-174)Online publication date: 19-Nov-2021
  • (2020)Consistent updating of databases with marked nullsKnowledge and Information Systems10.1007/s10115-019-01402-w62:4(1571-1609)Online publication date: 1-Apr-2020
  • (2018)Reflections on Schema Mappings, Data Exchange, and Metadata ManagementProceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems10.1145/3196959.3196991(107-109)Online publication date: 27-May-2018
  • (2018)Data ExchangeEncyclopedia of Database Systems10.1007/978-1-4614-8265-9_960(755-762)Online publication date: 7-Dec-2018
  • (2017)Canonical Forms for Isomorphic and Equivalent RDF GraphsACM Transactions on the Web10.1145/306833311:4(1-62)Online publication date: 25-Jul-2017
  • (2016)Data ExchangeEncyclopedia of Database Systems10.1007/978-1-4899-7993-3_960-2(1-9)Online publication date: 29-Dec-2016
  • (2015)Many-one reductions and the category of multivalued functionsMathematical Structures in Computer Science10.1017/S096012951500026227:3(376-404)Online publication date: 3-Jul-2015
  • (2014)Optimizing the chaseProceedings of the VLDB Endowment10.14778/2733085.27330937:14(1869-1880)Online publication date: 1-Oct-2014
  • (2011)Foundations of Semantic Web databasesJournal of Computer and System Sciences10.1016/j.jcss.2010.04.00977:3(520-541)Online publication date: 1-May-2011
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media