DOI: 10.1145/1526709.1526797
research-article

Measuring the similarity between implicit semantic relations from the web

Published: 20 April 2009

Abstract

Measuring the similarity between semantic relations that hold among entities is an important and necessary step in various Web-related tasks such as relation extraction, information retrieval, and analogy detection. For example, consider the case in which a person knows a pair of entities (e.g., Google, YouTube) between which a particular relation holds (e.g., acquisition). The person is interested in retrieving other such pairs with similar relations (e.g., Microsoft, Powerset). Existing keyword-based search engines cannot be applied directly in this case because, in keyword-based search, the goal is to retrieve documents that are relevant to the words used in a query, not necessarily to the relations implied by a pair of words. We propose a relational similarity measure that uses a Web search engine to compute the similarity between the semantic relations implied by two pairs of words. Our method has three components: representing the various semantic relations that exist between a pair of words using automatically extracted lexical patterns, clustering the extracted lexical patterns to identify the different patterns that express a particular semantic relation, and measuring the similarity between semantic relations using a metric learning approach. We evaluate the proposed method on two tasks: classifying semantic relations between named entities, and solving word-analogy questions. The proposed method outperforms all baselines in the relation classification task, with a statistically significant average precision score of 0.74. Moreover, it reduces the time taken by Latent Relational Analysis to process 374 word-analogy questions from 9 days to less than 6 hours, with an SAT score of 51%.
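The first component of the method, representing a word pair by the lexical patterns that connect its two words, can be illustrated with a small sketch. This is not the authors' implementation: the snippets are hypothetical stand-ins for search-engine results, the function names are invented for illustration, and plain cosine similarity is swapped in for the pattern clustering and metric learning steps described in the abstract.

```python
import math
import re
from collections import Counter

# Hypothetical snippets standing in for search-engine results (not real data).
SNIPPETS = {
    ("Google", "YouTube"): [
        "Google acquired YouTube in a stock deal",
        "Google to buy YouTube",
    ],
    ("Microsoft", "Powerset"): [
        "Microsoft acquired Powerset last month",
        "Microsoft to buy Powerset",
    ],
    ("Paris", "France"): [
        "Paris is the capital of France",
    ],
}


def extract_patterns(pair, snippets):
    """Count the word sequences found between the two entities as 'X ... Y' patterns."""
    first, second = pair
    regex = re.compile(re.escape(first) + r"\s+(.+?)\s+" + re.escape(second))
    patterns = Counter()
    for text in snippets:
        for middle in regex.findall(text):
            patterns["X " + middle.lower() + " Y"] += 1
    return patterns


def cosine(a, b):
    """Cosine similarity between two sparse pattern-frequency vectors."""
    dot = sum(count * b[pattern] for pattern, count in a.items() if pattern in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def relational_similarity(pair_a, pair_b):
    """Compare two word pairs by the lexical patterns they share (simplified)."""
    vec_a = extract_patterns(pair_a, SNIPPETS[pair_a])
    vec_b = extract_patterns(pair_b, SNIPPETS[pair_b])
    return cosine(vec_a, vec_b)
```

In this toy data, (Google, YouTube) and (Microsoft, Powerset) share the patterns "X acquired Y" and "X to buy Y" and so score close to 1, while (Google, YouTube) and (Paris, France) share no pattern and score 0. Exact pattern matching of this kind would treat "X acquired Y" and "X to buy Y" as unrelated; grouping such patterns is what the clustering step in the abstract is meant to address.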


Published In

WWW '09: Proceedings of the 18th international conference on World wide web
April 2009
1280 pages
ISBN:9781605584874
DOI:10.1145/1526709

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. natural language processing
  2. relational similarity
  3. web mining

Qualifiers

  • Research-article

Conference

WWW '09
Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Cited By

  • (2021) Data Set and Evaluation of Automated Construction of Financial Knowledge Graph, Data Intelligence, 10.1162/dint_a_00108, 3:3, (418-443), Online publication date: 8-Sep-2021
  • (2019) Overview of Knowledge Mapping Construction Technology, 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), 10.1109/ITAIC.2019.8785831, (1572-1578), Online publication date: May-2019
  • (2018) Applying Semantic Relations for Automatic Topic Ontology Construction, Developments and Trends in Intelligent Technologies and Smart Systems, 10.4018/978-1-5225-3686-4.ch004, (48-77), Online publication date: 2018
  • (2018) Causality Patterns for Detecting Adverse Drug Reactions From Social Media: Text Mining Approach, JMIR Public Health and Surveillance, 10.2196/publichealth.8214, 4:2, (e51), Online publication date: 9-May-2018
  • (2018) Protein-protein interaction identification using a similarity-constrained graph model, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 10.1109/TCBB.2017.2777448, (1-1), Online publication date: 2018
  • (2015) Constrained information-theoretic tripartite graph clustering to identify semantically similar relations, Proceedings of the 24th International Conference on Artificial Intelligence, 10.5555/2832747.2832790, (3882-3889), Online publication date: 25-Jul-2015
  • (2015) A method for finding similes to support final selection from search results, Proceedings of the 9th International Conference on Ubiquitous Information Management and Communication, 10.1145/2701126.2701197, (1-4), Online publication date: 8-Jan-2015
  • (2015) Protein-protein interaction identification using a hybrid model, Artificial Intelligence in Medicine, 10.1016/j.artmed.2015.05.003, 64:3, (185-193), Online publication date: 1-Jul-2015
  • (2014) Co-Clustering Algorithm: Batch, Mini-Batch, and Online, International Journal of Information and Electronics Engineering, 10.7763/IJIEE.2014.V4.461, 4:5, Online publication date: 2014
  • (2014) Searching for analogical ideas with crowds, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 10.1145/2556288.2557378, (1225-1234), Online publication date: 26-Apr-2014
