Similarity-based Distant Supervision for Definition Retrieval

Authors:
Jiepu Jiang, University of Massachusetts Amherst, Amherst, MA, USA
James Allan, University of Massachusetts Amherst, Amherst, MA, USA

CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, November 2017, Pages 527–536, https://doi.org/10.1145/3132847.3133032

Published: 6 November 2017

ABSTRACT

Recognizing definition sentences from free text corpora often requires hand-crafted patterns or explicitly labeled training instances. We present a distant supervision approach addressing this challenge without using explicitly labeled data. We use plausibly good but imperfect definition sentences from Wikipedia as references to annotate sentences in a target corpus based on text similarity measures such as ROUGE. Experimental results show our approach is highly effective, generating noisy but large, useful, and localized training instances. Definition sentence retrieval models trained using the synthesized training examples are more effective than those learned from manual judgments of a few thousand sentences. We also examine different text similarity measures for annotation, including both unsupervised and supervised ones. We show that our method can significantly benefit from supervised text similarity measures learned from either external training data (from the SemEval Semantic Text Similarity task) or local ones (a few hundred judged sentences on the target corpus). Our method offers a cheap, effective, and flexible solution to this task and can benefit a broad range of applications such as web search engines and QA systems.
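
The annotation step described in the abstract can be illustrated with a minimal sketch. It scores each candidate sentence against a reference definition using an F1 over clipped n-gram overlap, in the spirit of ROUGE-N. This is not the authors' implementation: the abstract does not state which ROUGE variant, tokenizer, or thresholds were used, and the function names (`tokenize`, `ngram_counts`, `rouge_n_f1`, `annotate`) and the example sentences are hypothetical.

```python
from collections import Counter
import string


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(string.punctuation).lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def ngram_counts(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """F1 over clipped n-gram overlap (a simplified ROUGE-N style score)."""
    cand = ngram_counts(tokenize(candidate), n)
    ref = ngram_counts(tokenize(reference), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def annotate(sentences, reference, n=1):
    """Return (score, sentence) pairs, highest similarity first.

    The scores act as noisy labels: sentences that resemble the reference
    definition are treated as more likely to be definitions themselves.
    """
    scored = ((rouge_n_f1(s, reference, n), s) for s in sentences)
    return sorted(scored, reverse=True)


# Hypothetical data: a reference definition and two target-corpus sentences.
REFERENCE = "A neural network is a computing system inspired by biological brains."
CANDIDATES = [
    "We trained the model for ten epochs.",
    "A neural network is a computing system inspired by the brain.",
]

for score, sentence in annotate(CANDIDATES, REFERENCE):
    print(f"{score:.3f}  {sentence}")
```

The resulting scores could serve as graded training labels for a ranking model, or be thresholded into binary labels. Either choice is an assumption here rather than something the abstract specifies.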

Index Terms

1. Information systems
  1. Information retrieval
    1. Evaluation of retrieval results
      1. Relevance assessment
    2. Retrieval models and ranking
      1. Similarity measures

Published in
CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
November 2017
2604 pages
ISBN: 9781450349185
DOI: 10.1145/3132847
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
definition sentence retrieval
definitional question answering
distant supervision
semantic textual similarity
Qualifiers
- research-article
Acceptance Rates
CIKM '17 Paper Acceptance Rate: 171 of 855 submissions, 20%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%