research-article

An interpretable method for text summarization based on simplicial non-negative matrix factorization

Authors: Nguyen Kim Anh, Nguyen Khac Toi, Ngo Van Linh (Hanoi University of Science and Technology)

SoICT '14: Proceedings of the 5th Symposium on Information and Communication Technology, December 2014, Pages 57–64. https://doi.org/10.1145/2676585.2676604

Published: 04 December 2014

ABSTRACT

Automatic text summarization plays an important role in information retrieval and text mining. Furthermore, it provides a useful solution to the information overload problem. In this paper, we propose an unsupervised generic document summarization method based on simplicial NMF, which inherits several advantages of simplicial NMF: easy interpretability, low complexity, convexity, and sparsity. By focusing on the major topics contained within every sentence as well as within the entire document, our method generates better summaries with less repetition. Experimental results demonstrate the effectiveness of our method. In terms of summarization performance, our approach obtains mostly higher ROUGE scores than the NMF-based method.
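The abstract's goal of "less repetition", together with the author tag "maximal marginal relevance" (MMR), can be illustrated with a generic greedy MMR selector. This is a minimal sketch only, not the paper's algorithm: it does not perform simplicial NMF, and the function names `cosine` and `mmr_select`, the bag-of-words relevance score, and the trade-off parameter `lam` are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, k=2, lam=0.5):
    """Greedily pick up to k sentence indices by maximal marginal relevance.

    Hypothetical helper: relevance is the similarity of a sentence to the
    whole document, redundancy is its highest similarity to any sentence
    already selected. A topic model such as NMF could supply other scores.
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    doc = sum(bags, Counter())  # bag of words for the whole document
    relevance = [cosine(b, doc) for b in bags]
    selected = []
    while len(selected) < min(k, len(sentences)):
        best, best_score = None, -math.inf
        for i in range(len(sentences)):
            if i in selected:
                continue
            redundancy = max(
                (cosine(bags[i], bags[j]) for j in selected), default=0.0
            )
            # Reward relevance, penalize overlap with the current summary.
            score = lam * relevance[i] - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected
```

With `lam=0.5`, an exact duplicate of an already selected sentence receives a redundancy penalty of about 1 and is passed over in favor of a less relevant but novel sentence; larger values of `lam` weight relevance more heavily and may let near-duplicates through.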

Index Terms

1. Information systems
  1. Information retrieval

Published in
SoICT '14: Proceedings of the 5th Symposium on Information and Communication Technology
December 2014
304 pages
ISBN: 9781450329309
DOI: 10.1145/2676585
Conference Chair: Nguyen Trong Giang (HUST, Vietnam)
General Chairs: Huynh Quyet Thang (HUST, Vietnam), Ismal Khalil (Austria)
Program Chairs: Ngo Hong Son (HUST, Vietnam), Yves Deville (UCL, Belgium), Marc Bui (EPHE, France)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
maximal marginal relevance
simplicial non-negative matrix factorization
text summarization
Acceptance Rates
Overall Acceptance Rate: 147 of 318 submissions, 46%