| A differential LSI method for document classification |
| Source
|
Annual Meeting of the ACL
Proceedings of the sixth international workshop on Information retrieval with Asian languages - Volume 11
Sapporo, Japan
Pages: 25 - 32
Year of Publication: 2003
|
|
Authors
|
|
Liang Chen
|
University of Northern British Columbia, Prince George, BC, Canada
|
|
Naoyuki Tokuda
|
R & D Center, Sunflare Company, Tokyo, Japan
|
|
Akira Nagai
|
Utsunomiya University, Utsunomiya, Tochigi, Japan
|
|
| Publisher |
Association for Computational Linguistics
Morristown, NJ, USA
|
ABSTRACT
We have developed an effective probabilistic classifier for document classification by introducing the concepts of differential document vectors and DLSI (differential latent semantic indexing) spaces. A simple a posteriori calculation using the intra- and extra-document statistics demonstrates that the DLSI space-based probabilistic classifier outperforms the widely used LSI space-based classifier in classification performance.
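The abstract does not spell out how the differential vectors or the DLSI spaces are built, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a differential document vector is the difference of two length-normalized term vectors, that "intra" pairs share a class label while "extra" pairs do not, and that each DLSI space is spanned by the leading left singular vectors of the corresponding differential matrix. The function names, toy term counts, and labels are hypothetical.

```python
import numpy as np


def normalize(vec):
    """Scale a term vector to unit length (zero vectors are returned as-is)."""
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def differential_vectors(docs, labels):
    """Build intra- and extra-class differential matrices (terms x pairs).

    Assumption: a differential vector is the difference between two
    normalized document vectors; same-label pairs are "intra",
    different-label pairs are "extra".
    """
    unit = [normalize(np.asarray(d, dtype=float)) for d in docs]
    intra, extra = [], []
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            diff = unit[i] - unit[j]
            (intra if labels[i] == labels[j] else extra).append(diff)
    return np.array(intra).T, np.array(extra).T


def lsi_basis(matrix, k):
    """Return the top-k left singular vectors and singular values."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k]


# Hypothetical toy corpus: four documents over four terms, two classes.
docs = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 1, 2], [0, 0, 2, 1]]
labels = ["a", "a", "b", "b"]

intra, extra = differential_vectors(docs, labels)
u_intra, s_intra = lsi_basis(intra, k=1)
u_extra, s_extra = lsi_basis(extra, k=2)
```

Projecting a new differential vector onto `u_intra` and `u_extra` would give the reduced coordinates from which intra- and extra-class likelihoods could be estimated; the a posteriori step itself is left out here because the abstract gives no formula for it.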