Creation of topic map by identifying topic chain in Chinese
Source: Document Engineering
Proceedings of the 2004 ACM Symposium on Document Engineering
Milwaukee, Wisconsin, USA
Session: Document creation I
Pages: 112-114
Year of Publication: 2004
ISBN: 1-58113-938-1
Authors
Ching-Long Yeh  Tatung University, Taipei, Taiwan
Yi-Chun Chen  Tatung University, Taipei, Taiwan
Sponsors
ACM: Association for Computing Machinery
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1030397.1030420

ABSTRACT

XML Topic Maps enable multiple concurrent views of sets of information objects and can be used in different applications, for example thesaurus-like interfaces to corpora, navigational tools for cross-references or citation systems, and information filtering or delivery depending on user profiles. However, enriching the information in a topic map or connecting it with a document's URI is labor-intensive and time-consuming. To solve this problem, we propose an approach based on natural language processing techniques to identify and extract useful information from raw Chinese text. Unlike most traditional approaches to parsing sentences, which integrate complex linguistic information and domain knowledge, we work on the output of a part-of-speech tagger and use shallow parsing instead of complex parsing to identify the topics of sentences. The key elements of the centering model of local discourse coherence are employed to extract the structures of discourse segments. We use the local discourse structure to resolve zero anaphora in Chinese and then identify the topic, which is the most salient element in a sentence. After we obtain all the topics of a document, we assign the document to a topic node of the topic map and, at the same time, add the document's information to the topic element.
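
The pipeline described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: the tag set ("N" for noun, "V" for verb), the rule "first noun before the first verb is the topic", the function names, and the dictionary standing in for a topic map are all assumptions made for illustration. The only idea taken from the abstract is that an utterance with an omitted subject (a zero anaphor) inherits its topic from the preceding utterance, and that the document is then attached to a topic node.

```python
from collections import Counter

# Hypothetical tag set: "N" = noun, "V" = verb; other tags are ignored.

def sentence_topic(tagged, previous_topic=None):
    """Return the topic of one POS-tagged utterance.

    Simplified rule (an assumption, not the paper's algorithm): the first
    noun before the first verb is the topic. If no noun precedes the verb,
    the subject is treated as a zero anaphor and the previous topic is kept.
    """
    for word, tag in tagged:
        if tag == "N":
            return word
        if tag == "V":
            break
    return previous_topic


def topic_chain(utterances):
    """Resolve one topic per utterance, carrying topics forward in order."""
    chain, topic = [], None
    for tagged in utterances:
        topic = sentence_topic(tagged, topic)
        chain.append(topic)
    return chain


def add_to_topic_map(topic_map, uri, chain):
    """Attach the document URI to the node of its most frequent topic."""
    counts = Counter(t for t in chain if t is not None)
    if not counts:
        return None
    main_topic = counts.most_common(1)[0][0]
    topic_map.setdefault(main_topic, []).append(uri)
    return main_topic


# Toy input: "Zhang San bought a book. (He) read all day. The book is good."
utterances = [
    [("张三", "N"), ("买", "V"), ("书", "N")],
    [("看", "V"), ("了", "U"), ("一天", "N")],   # subject omitted
    [("书", "N"), ("很", "D"), ("好", "V")],
]
chain = topic_chain(utterances)          # ["张三", "张三", "书"]
topic_map = {}
add_to_topic_map(topic_map, "http://example.org/doc1", chain)
```

In this toy input the second utterance begins with a verb, so it keeps the topic of the first one. A real system would replace the single-noun rule with shallow-parsed noun phrases and a ranking of candidate centers.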


