Research on extracting subject from Chinese text (poster session)

Authors:
Han Kesong, Wang Yongcheng, Wu Fangfang

Shanghai Jiaotong University, Department of Computer Science, Shanghai, P.R. China
IRAL '00: Proceedings of the fifth international workshop on Information retrieval with Asian languages, November 2000, Pages 211–212
https://doi.org/10.1145/355214.355249

Published: 1 November 2000

ABSTRACT

Because natural languages are flexible and diverse, extracting the subject of a text is one of the most difficult but important tasks in natural language processing (NLP). Owing to the distinctive linguistic and grammatical structures of Chinese, we can currently adopt only non-semantic approaches to extract the subject from Chinese text. This paper presents three such approaches: the first is based on a component-word dictionary, the second on a subject-word dictionary, and the third on a statistical method. We describe the process of each approach. To test them, we developed three independent systems and designed a comparison experiment. The experimental results are illuminating: every system can extract the text's subject to some extent; however, we may need to combine these approaches to obtain a better one.
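
The abstract gives no algorithmic detail, so the following is only a rough sketch of what a dictionary-based and a statistical, non-semantic approach might look like. It is not the authors' systems. The dictionary contents, the function names, and the choice of character bigrams as the statistical unit are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical subject-word dictionary; the paper's real dictionaries are not given.
SUBJECT_WORDS = {"计算机", "信息检索", "自然语言"}


def dictionary_candidates(text, dictionary):
    """Rank dictionary terms by how often they occur in the text (assumed scoring)."""
    counts = Counter({word: text.count(word) for word in dictionary})
    # Drop terms that never occur; most_common() sorts by count, highest first.
    return [(word, n) for word, n in counts.most_common() if n > 0]


def is_cjk(ch):
    """True for characters in the CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def bigram_candidates(text, top_k=3):
    """Rank adjacent pairs of Chinese characters by raw frequency.

    Bigram counting stands in for the unspecified statistical method; it needs
    no word segmentation, which is why it is used in this sketch.
    """
    bigrams = Counter(
        text[i:i + 2]
        for i in range(len(text) - 1)
        if is_cjk(text[i]) and is_cjk(text[i + 1])
    )
    return bigrams.most_common(top_k)


if __name__ == "__main__":
    sample = "信息检索是计算机科学的分支。信息检索研究文本。"
    print(dictionary_candidates(sample, SUBJECT_WORDS))
    print(bigram_candidates(sample))
```

Combining the two rankings, for example by preferring frequent bigrams that also appear inside a dictionary term, would be one simple way to explore the combination the abstract suggests, though the paper does not say how the authors would do it.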


Published in
IRAL '00: Proceedings of the fifth international workshop on Information retrieval with Asian languages
November 2000
220 pages
ISBN: 1581133006
DOI: 10.1145/355214
Chairmen: Kam-Fai Wong (Chinese Univ. of Hong Kong, Hong Kong, China), Dik L. Lee (Hong Kong Univ. of Science and Technology, Hong Kong, China), Jong-Hyeok Lee (Pohang Univ. of Science and Technology, Korea)
Copyright © 2000 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Chinese text
component-word
statistics
subject extracting
subject-word