Korean named entity recognition using HMM and CoTraining model

Published: 07 July 2003

Abstract

Named entity recognition is important in sophisticated information service systems such as Question Answering and Text Mining, since most answer types and text mining units depend on the named entity type. We therefore focus on a named entity recognition model for Korean. Korean named entity recognition is difficult because the words of a named entity carry no distinctive surface features comparable to capitalization in English. It depends heavily on large amounts of hand-labeled data and on a named entity dictionary, even though these are tedious and expensive to create. In this paper, we devise an HMM-based named entity recognizer that considers various context models. Furthermore, we consider a weakly supervised learning technique, CoTraining, to combine labeled and unlabeled data.
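
To make the idea of combining labeled and unlabeled data concrete, the following is a minimal sketch of a generic co-training loop in the style of Blum and Mitchell. It is not the recognizer described in this paper: the HMM is replaced by a simple count-based classifier, and the two views (the entity string itself and one neighboring context word), the function names, the confidence threshold, and the romanized toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train(examples, view):
    """Count how often each feature value of one view occurs with each label."""
    counts = defaultdict(Counter)
    for features, label in examples:
        counts[features[view]][label] += 1
    return counts

def predict(counts, features, view):
    """Return (label, confidence), or (None, 0.0) if the feature value is unseen."""
    seen = counts.get(features[view])
    if not seen:
        return None, 0.0
    label, n = seen.most_common(1)[0]
    return label, n / sum(seen.values())

def co_train(labeled, unlabeled, rounds=5, threshold=0.9):
    """Grow the labeled set with confident predictions from either view."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        # One classifier per view, retrained on the current labeled set.
        models = [train(labeled, view) for view in (0, 1)]
        newly_labeled, remaining = [], []
        for features in pool:
            guesses = [predict(models[v], features, v) for v in (0, 1)]
            label, conf = max(guesses, key=lambda g: g[1])
            if label is not None and conf >= threshold:
                newly_labeled.append((features, label))
            else:
                remaining.append(features)
        if not newly_labeled:
            break  # neither view can label anything new
        labeled.extend(newly_labeled)
        pool = remaining
    return labeled, pool

# Hypothetical toy data: each example is (entity string, context word).
seed = [(("Seoul", "in"), "LOC"), (("Kim", "Mr."), "PER")]
unlabeled = [("Busan", "in"), ("Lee", "Mr."), ("Busan", "near"),
             ("Daegu", "near"), ("Park", "said")]

labeled, pool = co_train(seed, unlabeled)
for features, label in labeled:
    print(features, label)
print("still unlabeled:", pool)
```

In this toy run the context view first labels "Busan" from the seed context "in", the entity-string view then carries that label to the new context "near", and the context view in turn labels "Daegu". The example ("Park", "said") shares no feature with any labeled example and stays unlabeled.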

Cited By

  • (2019) Curation Technologies for Cultural Heritage Archives. Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, pages 117-122. doi:10.1145/3322905.3322909. Online publication date: 8-May-2019
  • (2011) Automatic rule learning exploiting morphological features for named entity recognition in Turkish. Journal of Information Science 37:2, pages 137-151. doi:10.1177/0165551511398573. Online publication date: 1-Apr-2011

Published In

AsianIR '03: Proceedings of the sixth international workshop on Information retrieval with Asian languages - Volume 11
July 2003
175 pages
Program Chair: Jun Adachi

Publisher

Association for Computational Linguistics

United States

Author Tags

  1. HMM
  2. Korean named entity
  3. co-training
