Terminology-based knowledge mining for new knowledge discovery

Authors:

Sophia Ananiadou,

Katsumori Matsushima

ACM Transactions on Asian Language Information Processing (TALIP), Volume 5, Issue 1

Pages 74 - 88

https://doi.org/10.1145/1131348.1131354

Published: 01 March 2006

Abstract

In this article we present an integrated knowledge-mining system for the domain of biomedicine that combines automatic term recognition, term clustering, information retrieval, and visualization. The primary objective of the system is to facilitate knowledge acquisition from documents and to aid knowledge discovery through terminology-based similarity calculation and visualization of automatically structured knowledge. The system also supports the integration of different types of databases and the simultaneous retrieval of different types of knowledge. To accelerate knowledge discovery, we also propose a visualization method for generating similarity-based knowledge maps. The method is based on real-time terminology-based knowledge clustering and categorization, and lets users view the resulting knowledge maps graphically as they are generated. Lastly, we discuss experiments using the GENIA corpus to assess the practicality and applicability of the system.
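
As a rough illustration of what a terminology-based similarity calculation can look like, the sketch below compares documents by the recognized terms they share. The abstract does not state which similarity measure the system uses, so cosine similarity over term counts is an assumption made here for illustration only. The term list, the sample sentences, and the function names `term_vector` and `cosine` are likewise hypothetical, and the naive substring matching stands in for a real automatic term recognizer.

```python
from collections import Counter
from math import sqrt


def term_vector(text, terms):
    """Count how often each known term occurs in the lower-cased text.

    Assumption: `terms` is a list of already-recognized, lower-cased terms;
    plain substring counting replaces a real term recognizer.
    """
    lowered = text.lower()
    return Counter({t: lowered.count(t) for t in terms if t in lowered})


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a document without any known term is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical term list and documents, not taken from the GENIA corpus.
terms = ["nf-kappa b", "t cells", "protein kinase c"]
doc1 = term_vector("NF-kappa B activation in T cells", terms)
doc2 = term_vector("T cells and NF-kappa B", terms)
doc3 = term_vector("Inhibition of protein kinase C", terms)

print(cosine(doc1, doc2))  # documents sharing both terms
print(cosine(doc1, doc3))  # documents sharing no terms
```

Pairwise scores of this kind could then feed a clustering step or a map layout, which is the role the abstract assigns to similarity in generating knowledge maps.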

Published In

ACM Transactions on Asian Language Information Processing Volume 5, Issue 1

March 2006

88 pages

ISSN: 1530-0226

EISSN: 1558-3430

DOI: 10.1145/1131348

Copyright © 2006 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States
