DOI: 10.1145/1772690.1772707
research-article

Relational duality: unsupervised extraction of semantic relations between entities on the web

Published: 26 April 2010

Abstract

Extracting semantic relations among entities is an important first step in various Web mining and natural language processing tasks, such as information extraction, relation detection, and social network mining. A relation can be expressed extensionally, by stating all the instances of that relation, or intensionally, by defining all the paraphrases of that relation. For example, consider the ACQUISITION relation between two companies. An extensional definition of ACQUISITION contains all pairs of companies in which one company is acquired by another, e.g. (YouTube, Google) or (Powerset, Microsoft). On the other hand, we can intensionally define ACQUISITION as the relation described by lexical patterns such as "X is acquired by Y" or "Y purchased X", where X and Y denote two companies. We use this dual representation of semantic relations to propose a novel sequential co-clustering algorithm that can extract numerous relations efficiently from unlabeled data. We provide an efficient heuristic to find the parameters of the proposed co-clustering algorithm. Using the clusters produced by the algorithm, we train an L1-regularized logistic regression model to identify the representative patterns that describe the relation expressed by each cluster. We evaluate the proposed method in three different tasks: measuring relational similarity between entity pairs, open information extraction (Open IE), and classifying relations in a social network system. Experiments conducted using a benchmark dataset show that the proposed method improves on existing relational similarity measures. Moreover, the proposed method significantly outperforms the current state-of-the-art Open IE systems in terms of both precision and recall. The proposed method correctly classifies 53 relation types in an online social network containing 470,671 nodes and 35,652,475 edges, thereby demonstrating its efficacy in real-world relation detection tasks.
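
The dual representation described in the abstract can be illustrated with a small sketch. This is not the paper's sequential co-clustering algorithm. It only shows, under stated assumptions, how one table of (entity pair, lexical pattern) co-occurrence counts can be read in two directions, and how entity pairs can then be compared by the patterns they share. The counts, the third entity pair, the "X is the CEO of Y" pattern, and the function names `build_views` and `cosine` are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from math import sqrt

# Toy (entity pair, lexical pattern, count) observations.
# All counts are invented for illustration; they are not the paper's data.
OBSERVATIONS = [
    (("YouTube", "Google"), "X is acquired by Y", 3),
    (("YouTube", "Google"), "Y purchased X", 2),
    (("Powerset", "Microsoft"), "X is acquired by Y", 1),
    (("Powerset", "Microsoft"), "Y purchased X", 4),
    (("Steve Jobs", "Apple"), "X is the CEO of Y", 5),  # hypothetical extra pair
]


def build_views(observations):
    """Index the same co-occurrence counts in both directions."""
    by_pair = defaultdict(dict)     # entity pair -> {pattern: count}
    by_pattern = defaultdict(dict)  # pattern -> {entity pair: count}
    for pair, pattern, count in observations:
        by_pair[pair][pattern] = by_pair[pair].get(pattern, 0) + count
        by_pattern[pattern][pair] = by_pattern[pattern].get(pair, 0) + count
    return dict(by_pair), dict(by_pattern)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0) for key, weight in u.items())
    norm_u = sqrt(sum(weight * weight for weight in u.values()))
    norm_v = sqrt(sum(weight * weight for weight in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


by_pair, by_pattern = build_views(OBSERVATIONS)

# Entity pairs that share patterns get a positive similarity;
# pairs with no pattern in common get 0.0.
acquisition_sim = cosine(by_pair[("YouTube", "Google")],
                         by_pair[("Powerset", "Microsoft")])
unrelated_sim = cosine(by_pair[("YouTube", "Google")],
                       by_pair[("Steve Jobs", "Apple")])
```

Grouping the keys of `by_pair` by this similarity would approximate an extensional view of a relation (a set of entity pairs), while grouping the keys of `by_pattern` would approximate an intensional view (a set of paraphrase patterns). Cosine similarity is a swapped-in, generic measure here, not necessarily the one used in the paper.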

      Published In

      ACM Other conferences
      WWW '10: Proceedings of the 19th international conference on World wide web
      April 2010
      1407 pages
      ISBN: 9781605587998
      DOI: 10.1145/1772690

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. relation extraction
      2. relational duality
      3. relational similarity
      4. web mining

      Conference

      WWW '10: The 19th International World Wide Web Conference
      April 26 - 30, 2010
      Raleigh, North Carolina, USA

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Cited By

      • (2024) Graph-based Named Entity Information Retrieval from News Articles using Neo4j. 2024 11th International Conference on Computing for Sustainable Global Development (INDIACom), pages 320-324. DOI: 10.23919/INDIACom61295.2024.10499187. Online publication date: 28-Feb-2024
      • (2024) SCL: Selective Contrastive Learning for Data-driven Zero-shot Relation Extraction. Transactions of the Association for Computational Linguistics, 12, pages 1720-1735. DOI: 10.1162/tacl_a_00721. Online publication date: 23-Dec-2024
      • (2024) Few-Shot Relation Extraction With Dual Graph Neural Network Interaction. IEEE Transactions on Neural Networks and Learning Systems, 35(10), pages 14396-14408. DOI: 10.1109/TNNLS.2023.3278938. Online publication date: Oct-2024
      • (2023) Temporal Relationship Extraction of Conflict Events in Open Source Military Journalism. 2023 2nd International Conference on Artificial Intelligence and Computer Information Technology (AICIT), pages 1-4. DOI: 10.1109/AICIT59054.2023.10277795. Online publication date: 15-Sep-2023
      • (2023) SCRE: special cargo relation extraction using representation learning. Neural Computing and Applications, 35(25), pages 18783-18801. DOI: 10.1007/s00521-023-08704-9. Online publication date: 18-Jun-2023
      • (2022) Discovering Fine-Grained Semantics in Knowledge Graph Relations. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pages 822-831. DOI: 10.1145/3511808.3557287. Online publication date: 17-Oct-2022
      • (2022) A machine learning approach to extracting spatial information from geological texts in Chinese. International Journal of Geographical Information Science, 36(11), pages 2169-2193. DOI: 10.1080/13658816.2022.2087224. Online publication date: 15-Jun-2022
      • (2022) An Application of Knowledge Graph for Enterprise Risk Prediction. Proceedings of the 12th International Conference on Computer Engineering and Networks, pages 1029-1038. DOI: 10.1007/978-981-19-6901-0_106. Online publication date: 20-Oct-2022
      • (2020) Knowledge Graph Oriented Information Extraction. Hans Journal of Data Mining, 10(04), pages 282-302. DOI: 10.12677/HJDM.2020.104030. Online publication date: 2020
      • (2020) Multi-granularity semantic representation model for relation extraction. Neural Computing and Applications. DOI: 10.1007/s00521-020-05464-8. Online publication date: 10-Nov-2020