skip to main content
10.1145/1871437.1871681acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
poster

Hypergraph-based multilevel matrix approximation for text information retrieval

Published: 26 October 2010 Publication History

Abstract

In Latent Semantic Indexing (LSI), a collection of documents is often pre-processed to form a sparse term-document matrix, followed by a computation of a low-rank approximation to the data matrix. A multilevel framework based on hypergraph coarsening is presented which exploits the hypergraph that is canonically associated with the sparse term-document matrix representing the data. The main goal is to reduce the cost of the matrix approximation without sacrificing accuracy. Because coarsening by multilevel hypergraph techniques is a form of clustering, the proposed approach can be regarded as a hybrid of factorization-based LSI and clustering-based LSI. Experimental results indicate that our method achieves good improvement of the retrieval performance at a reduced cost

References

[1]
M. W. Berry and M. Browne. Understanding Search Engines. SIAM Publiscations, 1999.
[2]
U. V. Catalyurek and C. Aykanat. Hypergraph-partitioning based decomposition for parallel sparse-matrix vector multiplication. IEEE Transaction on Parallel and Distributed Systems, 10(7):673--693, 1999.
[3]
E. Chisholm and T. G. Kolda. New term weighting formulas for the vector space method in information retrieval. Technical Report ORNL-TM-13756, Oak Ridge National Laboratory, Oak Ridge, TN, 1999.
[4]
S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman. Indexing by latent semantic analysis. J. Soc. Inf. Sci., 41(6):391--407, 1990.
[5]
K. Devine, E. G. Boman, R. Heaphy, R. Bisseling, and U. V. Catalyurek. Parallel hypergraph partitioning for scientific computing. In 20th International Parallel and Distributed Processing Symposium (IPDPS), 2006.
[6]
I. S. Dhillon and D. S. Modha. Concept decompositions for large sparse text data using clustering. Machine Learning, 42:143--175, 2001.
[7]
L. Eldén. Matrix Methods in Data Mining and Pattern e Recognition. SIAM Publications, 2007.
[8]
N. Jardine and C. J. van Rijsbergen. The use of hierarchic clustering in information retrieval. Information Storage and Retrieval, 7(5), 1971.
[9]
G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar. Multilevel hypergraph partitioning: Applications in VLSI domain. IEEE Trans. VLSI Sys., 7(1):69--79, 1999.
[10]
G. Karypis and V. Kumar. Multilevel k-way hypergraph partitioning. VLSI Design, 11(3):285--300, 2000.
[11]
T. G. Kolda and D. P. O'Leary. A semi-discrete matrix decomposition for latent semantic indexing in information retrieval. ACM Trans. Info. Sys., 16(4):322--346, 1998.
[12]
T. G. Kolda and D. P. O'Leary. Algorithm 805: Computation and uses of the semidiscrete matrix decomposition. ACM Tran. Math. Soft., 26(3):415--435, 2000.
[13]
D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401:788--791, 1999.
[14]
C.-J. Lin. Projected gradient methods for non-negative matrix factorization. Neural Computation, 19(10):2756--2779, 2007.
[15]
G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Inf. Process. Manage., 24(5):513--523, 1988.
[16]
B. Vastenhouw and R. H. Bisseling. A two-dimensional data distribution method for parallel sparse matrix-vector multiplication. SIAM review, 47(1):67--95, 2005.

Cited By

View all
  • (2022)Graph coarsening: from scientific computing to machine learningSeMA Journal10.1007/s40324-021-00282-x79:1(187-223)Online publication date: 10-Jan-2022
  • (2019)Graph-Based Semantic Learning, Representation and Growth from Text: A Systematic Review2019 IEEE 13th International Conference on Semantic Computing (ICSC)10.1109/ICOSC.2019.8665592(118-123)Online publication date: Jan-2019
  • (2019)Sampling and multilevel coarsening algorithms for fast matrix approximationsNumerical Linear Algebra with Applications10.1002/nla.223426:3Online publication date: 6-Mar-2019

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
October 2010
2036 pages
ISBN:9781450300995
DOI:10.1145/1871437
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 October 2010

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. latent semantic indexing
  2. low-rank matrix approximation
  3. multilevel hypergraph partitioning
  4. text information retrieval

Qualifiers

  • Poster

Conference

CIKM '10

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Upcoming Conference

CIKM '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)2
  • Downloads (Last 6 weeks)1
Reflects downloads up to 20 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2022)Graph coarsening: from scientific computing to machine learningSeMA Journal10.1007/s40324-021-00282-x79:1(187-223)Online publication date: 10-Jan-2022
  • (2019)Graph-Based Semantic Learning, Representation and Growth from Text: A Systematic Review2019 IEEE 13th International Conference on Semantic Computing (ICSC)10.1109/ICOSC.2019.8665592(118-123)Online publication date: Jan-2019
  • (2019)Sampling and multilevel coarsening algorithms for fast matrix approximationsNumerical Linear Algebra with Applications10.1002/nla.223426:3Online publication date: 6-Mar-2019

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media