A practical web-based approach to generating topic hierarchy for text segments
Source: Conference on Information and Knowledge Management
Proceedings of the thirteenth ACM international conference on Information and knowledge management
Washington, D.C., USA
SESSION: IR-2 (information retrieval): web information retrieval
Pages: 127 - 136  
Year of Publication: 2004
ISBN: 1-58113-874-1
Authors
Shui-Lung Chuang  Institute of Information Science, Academia Sinica, Taiwan, R.O.C.
Lee-Feng Chien  Institute of Information Science, Academia Sinica, Taiwan, R.O.C.
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1031171.1031193

ABSTRACT

Many information systems need to organize short text segments, such as keywords in documents and queries from users, into a well-formed topic hierarchy. In this paper, we address the problem of generating topic hierarchies for diverse text segments with a general and practical approach that uses the Web as an additional knowledge source. Unlike long documents, short text segments typically do not contain enough information to yield reliable features. This work investigates the use of highly ranked search-result snippets to enrich the representation of text segments. A hierarchical clustering algorithm is then applied to create the hierarchical topic structure of the text segments. Unlike traditional clustering algorithms, which tend to produce cluster hierarchies with a very unnatural shape, the approach tries to produce a more natural and comprehensive hierarchy. Extensive experiments were conducted on text segments from different domains. The results show the potential of the proposed approach, which we believe can benefit many information systems.
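
The two steps described in the abstract, enriching each short segment with search-result snippets and then clustering the enriched representations hierarchically, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `fetch_snippets` is a hypothetical stand-in that returns canned text instead of querying a real search engine, the TF-IDF weighting and cosine similarity are common defaults that the abstract does not specify, and the plain average-link agglomerative clustering is exactly the kind of traditional algorithm the paper contrasts itself with. It is not the authors' algorithm.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical canned "search results"; a real system would query a search engine.
CANNED_SNIPPETS = {
    "jaguar xk": ["jaguar xk sports car coupe engine",
                  "luxury car jaguar xk review engine"],
    "ford mustang": ["ford mustang sports car engine",
                     "mustang car review coupe"],
    "python list": ["python list programming language code",
                    "sort a list in python code"],
    "java array": ["java array programming language code",
                   "array sort in java code"],
}


def fetch_snippets(segment):
    """Hypothetical helper: return snippets for a text segment (no network)."""
    return CANNED_SNIPPETS[segment]


def tfidf_vectors(segments):
    """Represent each segment by TF-IDF weights over its snippet words."""
    bags = {s: Counter(" ".join(fetch_snippets(s)).split()) for s in segments}
    n = len(segments)
    # Document frequency: number of segments whose snippets contain the term.
    df = Counter(term for bag in bags.values() for term in bag)
    return {s: {t: tf * math.log(1 + n / df[t]) for t, tf in bag.items()}
            for s, bag in bags.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def leaves(tree):
    """List the segments under a node; a leaf is a plain string."""
    return [tree] if isinstance(tree, str) else leaves(tree[0]) + leaves(tree[1])


def build_hierarchy(segments):
    """Average-link agglomerative clustering; returns a nested binary tuple."""
    vecs = tfidf_vectors(segments)

    def avg_sim(a, b):
        pairs = [(x, y) for x in leaves(a) for y in leaves(b)]
        return sum(cosine(vecs[x], vecs[y]) for x, y in pairs) / len(pairs)

    clusters = list(segments)
    while len(clusters) > 1:
        # Merge the most similar pair of clusters into one parent node.
        a, b = max(combinations(clusters, 2), key=lambda p: avg_sim(*p))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append((a, b))
    return clusters[0]


if __name__ == "__main__":
    print(build_hierarchy(list(CANNED_SNIPPETS)))
```

On this toy data the segments themselves share no words, so any grouping comes entirely from the snippet text. The result is a strictly binary tree, which is one example of the unnatural shape the abstract attributes to traditional methods.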



