A differential LSI method for document classification
Source: Annual Meeting of the ACL
Proceedings of the sixth international workshop on Information retrieval with Asian languages - Volume 11
Sapporo, Japan
Pages: 25 - 32  
Year of Publication: 2003
Authors
Liang Chen  University of Northern British Columbia, Prince George, BC, Canada
Naoyuki Tokuda  R & D Center, Sunflare Company, Tokyo, Japan
Akira Nagai  Utsunomiya University, Utsunomiya, Tochigi, Japan
Publisher
Association for Computational Linguistics  Morristown, NJ, USA
DOI Bookmark: 10.3115/1118935.1118939

ABSTRACT

We have developed an effective probabilistic classifier for document classification by introducing the concepts of differential document vectors and DLSI (differential latent semantics index) spaces. A simple a posteriori calculation using intra- and extra-document statistics demonstrates that the DLSI space-based probabilistic classifier outperforms the widely used LSI space-based classifier in classification performance.

