Extracting contrastive information from negation patterns in biomedical literature
Source: ACM Transactions on Asian Language Information Processing (TALIP)
Volume 5, Issue 1 (March 2006)
Pages: 44-60
Year of Publication: 2006
ISSN: 1530-0226
Authors
Jung-Jae Kim, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Jong C. Park, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1131348.1131352

ABSTRACT

Expressions of negation in the biomedical literature often encode contrastive information as a means of explaining significant differences between the objects being contrasted. We show that such information gives additional insight into the structures and/or biological functions of these objects, yielding valuable knowledge for subcategorizing protein families by the properties that the proteins involved do not have in common. Based on the observation that expressions of negation mostly employ predictable syntactic structures that can be characterized by subclausal coordination and by clause-level parallelism, we present a system that extracts such contrastive information by identifying those syntactic structures with natural language processing techniques and with additional linguistic resources for semantics. The implemented system achieves 85.7% precision and 61.5% recall (including 7.7% partial recall), for an F score of 71.6. We apply the system to the biological interactions extracted by our biomedical information-extraction system in order to enrich proteome databases with contrastive information.
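
The relationship between the reported precision, recall, and F score can be checked with a short calculation. The sketch below assumes the F score is the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not name the variant, so that choice, the function name f_score, and the reading of "partial recall" as a portion already included in the 61.5% figure are assumptions made here for illustration.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 gives the harmonic mean (F1).

    Assumption: the abstract's "F score" is the balanced F1 measure.
    """
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing is retrieved correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures taken from the abstract, expressed as fractions.
precision = 0.857
recall = 0.615          # total recall, including partial matches
partial_recall = 0.077  # portion of recall attributed to partial matches

f1 = f_score(precision, recall)

# Assumption: recall without partial matches is the total minus the partial share.
exact_recall = recall - partial_recall

print(f"F1 = {100 * f1:.1f}")
print(f"recall excluding partial matches = {100 * exact_recall:.1f}%")
```

Under these assumptions the balanced F1 comes to about 71.6, and the recall excluding partial matches to about 53.8%.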


