DOI: 10.1145/2723372.2749428
research-article

ALEX: Automatic Link Exploration in Linked Data

Published: 27 May 2015

ABSTRACT

There has recently been an increase in the number of RDF knowledge bases published on the Internet. These rich RDF data sets can be useful in answering many queries, but much more interesting queries can be answered by integrating information from different data sets. This has given rise to research on automatically linking different RDF data sets representing different knowledge bases. This is challenging due to their scale and semantic heterogeneity. Various approaches have been proposed, but there is room for improving the quality of the generated links.

In this paper, we present ALEX, a system that aims at improving the quality of links between RDF data sets by using feedback provided by users on the answers to linked data queries. ALEX starts with a set of candidate links obtained using any automatic linking algorithm. ALEX utilizes user feedback to discover new links that did not exist in the set of candidate links while preserving link precision. ALEX discovers these new links by finding links that are similar to a link approved by the user through feedback on queries. ALEX uses a Monte-Carlo reinforcement learning method to learn how to explore in the space of possible links around a given link. Our experiments on real-world data sets show that ALEX is efficient and significantly improves the quality of links.
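The abstract does not specify the states, actions, or rewards of the Monte-Carlo learner, so the following is only a toy sketch of the general idea: estimate how worthwhile each way of exploring around an approved link is by averaging the feedback it earns. The action names, the `mc_explore` function, the reward values (+1 for an approved link, -1 for a rejected one), and the simulated user are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict


def mc_explore(feedback, actions, episodes=500, epsilon=0.1, seed=0):
    """Estimate the value of each exploration action by averaging returns.

    `feedback(action, rng)` is a hypothetical stand-in for user feedback on
    a link discovered by taking `action` from an approved link. It is
    assumed to return +1.0 for an approved link and -1.0 for a rejected one.
    """
    rng = random.Random(seed)
    q = defaultdict(float)  # running mean return per action
    n = defaultdict(int)    # number of returns observed per action
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)                # explore at random
        else:
            action = max(actions, key=lambda a: q[a])   # exploit best estimate
        g = feedback(action, rng)                       # return of this episode
        n[action] += 1
        q[action] += (g - q[action]) / n[action]        # incremental mean
    return dict(q)


# Hypothetical exploration actions with assumed user approval rates.
APPROVAL_RATE = {"same_label": 0.9, "same_type": 0.5, "shared_neighbor": 0.2}


def simulated_feedback(action, rng):
    """Simulated user: approves a link with the action's assumed rate."""
    return 1.0 if rng.random() < APPROVAL_RATE[action] else -1.0


q = mc_explore(simulated_feedback, list(APPROVAL_RATE))
for action, value in sorted(q.items(), key=lambda kv: -kv[1]):
    print(f"{action}: {value:+.2f}")
```

Each episode here is a single step, so the return is just the feedback on one discovered link. A fuller treatment would follow multi-step exploration paths and credit every step along a path with the path's final return.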


          • Published in

            SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
            May 2015
            2110 pages
            ISBN:9781450327589
            DOI:10.1145/2723372

            Copyright © 2015 ACM


            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Acceptance Rates

            SIGMOD '15 paper acceptance rate: 106 of 415 submissions (26%)
            Overall acceptance rate: 785 of 4,003 submissions (20%)
