research-article

Measuring the similarity between implicit semantic relations from the web

Authors:

Danushka T. Bollegala,

Mitsuru Ishizuka

WWW '09: Proceedings of the 18th international conference on World wide web

Pages 651 - 660

https://doi.org/10.1145/1526709.1526797

Published: 20 April 2009

Abstract

Measuring the similarity between semantic relations that hold among entities is an important and necessary step in various Web-related tasks such as relation extraction, information retrieval, and analogy detection. For example, consider the case in which a person knows a pair of entities (e.g., Google, YouTube) between which a particular relation holds (e.g., acquisition). The person is interested in retrieving other such pairs with similar relations (e.g., Microsoft, Powerset). Existing keyword-based search engines cannot be applied directly in this case because, in keyword-based search, the goal is to retrieve documents that are relevant to the words used in a query, not necessarily to the relations implied by a pair of words. We propose a relational similarity measure that uses a Web search engine to compute the similarity between the semantic relations implied by two pairs of words. Our method has three components: representing the various semantic relations that exist between a pair of words using automatically extracted lexical patterns, clustering the extracted lexical patterns to identify the different patterns that express a particular semantic relation, and measuring the similarity between semantic relations using a metric learning approach. We evaluate the proposed method on two tasks: classifying semantic relations between named entities, and solving word-analogy questions. The proposed method outperforms all baselines in a relation classification task with a statistically significant average precision score of 0.74. Moreover, it reduces the time taken by Latent Relational Analysis to process 374 word-analogy questions from 9 days to less than 6 hours, with an SAT score of 51%.
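
The three components named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes single-token entity names, a handful of invented snippets in place of search-engine results, a hand-written pattern-to-cluster table (`PATTERN_CLUSTERS`, hypothetical) in place of automatic clustering, and plain cosine similarity as a stand-in for the learned metric. All function and variable names are made up for illustration.

```python
import math
import re
from collections import Counter


def extract_patterns(snippet, a, b, max_gap=4):
    """Yield 'X ... Y' patterns: the words lying between a and b in a snippet."""
    tokens = re.findall(r"\w+", snippet.lower())
    a, b = a.lower(), b.lower()
    for i, tok in enumerate(tokens):
        if tok != a:
            continue
        # Look ahead for b with at most max_gap intervening words.
        for j in range(i + 1, min(i + max_gap + 2, len(tokens))):
            if tokens[j] == b:
                yield " ".join(["X"] + tokens[i + 1:j] + ["Y"])
                break


# Hypothetical, hand-written clusters; the paper derives clusters automatically.
PATTERN_CLUSTERS = {
    "X acquired Y": "ACQUIRE",
    "X bought Y": "ACQUIRE",
    "X to acquire Y": "ACQUIRE",
    "X was the ceo of Y": "LEADS",
}


def relation_vector(snippets, a, b):
    """Count how often each pattern cluster occurs for the pair (a, b)."""
    counts = Counter()
    for snippet in snippets:
        for pattern in extract_patterns(snippet, a, b):
            # Patterns without a cluster are kept as their own feature.
            counts[PATTERN_CLUSTERS.get(pattern, pattern)] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (a stand-in metric)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


# Invented snippets standing in for Web search results.
google = relation_vector(
    ["Google acquired YouTube", "Google to acquire YouTube"], "Google", "YouTube"
)
microsoft = relation_vector(["Microsoft bought Powerset"], "Microsoft", "Powerset")
jobs = relation_vector(["Jobs was the CEO of Apple"], "Jobs", "Apple")

sim_same = cosine(google, microsoft)  # both pairs fall in the ACQUIRE cluster
sim_diff = cosine(google, jobs)       # no shared cluster
```

Because "acquired", "bought", and "to acquire" map to one cluster, the two acquisition pairs score as similar even though they share no surface pattern, which is the motivation for the clustering step. A learned metric would additionally weight clusters by how well they discriminate between relation types.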

Index Terms

Measuring the similarity between implicit semantic relations from the web
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Clustering and classification
  2. Information systems applications
    1. Data mining
      1. Clustering

Published In

WWW '09: Proceedings of the 18th international conference on World wide web

April 2009

1280 pages

ISBN:9781605584874

DOI:10.1145/1526709

General Chairs: Juan Quemada (DIT-UPM), Gonzalo León (DIT-UPM)

Program Chairs: Yoelle Maarek (Google Inc., Israel), Wolfgang Nejdl (L3S and Hannover University)

Copyright © 2009 IW3C2 org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

WWW '09

WWW '09: The 18th International World Wide Web Conference

April 20 - 24, 2009

Madrid, Spain

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics

Total Citations: 49
Total Downloads: 841
Downloads (last 12 months): 8
Downloads (last 6 weeks): 2

Reflects downloads up to 20 Feb 2025

Cited By

Wang W, Xu Y, Du C, Chen Y, Wang Y, Wen H (2021). Data Set and Evaluation of Automated Construction of Financial Knowledge Graph. Data Intelligence, 3:3, 418-443. Online publication date: 8-Sep-2021.
https://doi.org/10.1162/dint_a_00108

Lu T, Hu X (2019). Overview of Knowledge Mapping Construction Technology. 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), 1572-1578. Online publication date: May-2019.
https://doi.org/10.1109/ITAIC.2019.8785831

Vairavasundaram S, R. L (2018). Applying Semantic Relations for Automatic Topic Ontology Construction. Developments and Trends in Intelligent Technologies and Smart Systems, 48-77. Online publication date: 2018.
https://doi.org/10.4018/978-1-5225-3686-4.ch004

Bollegala D, Maskell S, Sloane R, Hajne J, Pirmohamed M (2018). Causality Patterns for Detecting Adverse Drug Reactions From Social Media: Text Mining Approach. JMIR Public Health and Surveillance, 4:2, e51. Online publication date: 9-May-2018.
https://doi.org/10.2196/publichealth.8214

Niu Y, Wu H, Wang Y (2018). Protein-protein interaction identification using a similarity-constrained graph model. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1-1. Online publication date: 2018.
https://doi.org/10.1109/TCBB.2017.2777448

Wang C, Song Y, Roth D, Wang C, Han J, Ji H, Zhang M (2015). Constrained information-theoretic tripartite graph clustering to identify semantically similar relations. Proceedings of the 24th International Conference on Artificial Intelligence, 3882-3889. Online publication date: 25-Jul-2015.
https://dl.acm.org/doi/10.5555/2832747.2832790

Ushiama T, Matsuo N, Kim D, Kim S, Lee S, Hanzo L, Ismail R (2015). A method for finding similes to support final selection from search results. Proceedings of the 9th International Conference on Ubiquitous Information Management and Communication, 1-4. Online publication date: 8-Jan-2015.
https://dl.acm.org/doi/10.1145/2701126.2701197

Niu Y, Wang Y (2015). Protein-protein interaction identification using a hybrid model. Artificial Intelligence in Medicine, 64:3, 185-193. Online publication date: 1-Jul-2015.
https://dl.acm.org/doi/10.1016/j.artmed.2015.05.003

Cho H (2014). Co-Clustering Algorithm: Batch, Mini-Batch, and Online. International Journal of Information and Electronics Engineering, 4:5. Online publication date: 2014.
https://doi.org/10.7763/IJIEE.2014.V4.461

Yu L, Kittur A, Kraut R, Jones M, Palanque P, Schmidt A, Grossman T (2014). Searching for analogical ideas with crowds. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1225-1234. Online publication date: 26-Apr-2014.
https://dl.acm.org/doi/10.1145/2556288.2557378
