DOI: 10.1145/3035918.3064028
research-article

Interactive Mapping Specification with Exemplar Tuples

Published: 09 May 2017

ABSTRACT

While schema mapping specification is a cumbersome task for data curation specialists, it becomes infeasible for non-expert users, who are unacquainted with the semantics and languages of the involved transformations.

In this paper, we present an interactive framework for schema mapping specification suited for non-expert users. The key intuition is to leverage a few exemplar tuples to infer the underlying mappings and to iterate the inference process via simple user interactions in the form of boolean queries on the validity of the initial exemplar tuples. The approaches available so far mainly assume pairs of complete universal data examples, which can be provided only by data curation experts, or are limited to poorly expressive mappings.

We present several strategies for exploring the space of all possible mappings that satisfy arbitrary user exemplar tuples. During the exploration, we challenge the user to retain the mappings that best fit the user's requirements and to dynamically prune the exploration space, thus reducing the number of user interactions. We prove that the mappings obtained after the refinement process are correct. We present an extensive experimental analysis devoted to measuring the feasibility of our interactive mapping strategies and the inherent quality of the obtained mappings.
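The general idea of pruning a space of candidate mappings with yes/no answers can be illustrated with a toy sketch. It is not the algorithm of the paper: real schema mappings are logical dependencies, whereas here each candidate is simplified to a plain Python function from a source tuple to a target tuple. The names `refine`, `oracle`, and the three candidates are hypothetical and introduced only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, ...]
Mapping = Callable[[Row], Row]  # assumption: a mapping is just a function on tuples


def refine(candidates: Dict[str, Mapping],
           exemplars: List[Row],
           oracle: Callable[[Row, Row], bool]) -> Tuple[Dict[str, Mapping], int]:
    """Prune candidates with boolean answers; return survivors and question count."""
    alive = dict(candidates)
    questions = 0
    for src in exemplars:
        outputs = {name: m(src) for name, m in alive.items()}
        if len(set(outputs.values())) <= 1:
            continue  # survivors agree here, so a question would prune nothing
        for out in sorted(set(outputs.values())):
            questions += 1
            if not oracle(src, out):
                # Drop every candidate that proposed the rejected target tuple.
                alive = {n: m for n, m in alive.items() if outputs[n] != out}
    return alive, questions


# Hypothetical candidates over source tuples of the form (name, city, country).
candidates: Dict[str, Mapping] = {
    "name_city": lambda r: (r[0], r[1]),
    "name_country": lambda r: (r[0], r[2]),
    "city_country": lambda r: (r[1], r[2]),
}

# Simulated user who has the (name, city) mapping in mind.
def oracle(src: Row, out: Row) -> bool:
    return out == (src[0], src[1])

exemplars = [("Ann", "Lyon", "France"), ("Bob", "Rome", "Italy")]
survivors, questions = refine(candidates, exemplars, oracle)
print(sorted(survivors), questions)
```

In this sketch the second exemplar triggers no question because only one candidate is still alive, which mirrors, in a very simplified way, how pruning the space reduces the number of user interactions.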


Published in

SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
May 2017
1810 pages
ISBN: 9781450341974
DOI: 10.1145/3035918

Copyright © 2017 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall Acceptance Rate: 785 of 4,003 submissions, 20%
