DOI: 10.1145/3155133.3155177
research-article

Phrasal Graph-based Method for Abstractive Vietnamese Paragraph Compression

Published: 07 December 2017

ABSTRACT

Text compression is the task of identifying the main information in a source text and expressing it as a single short sentence. A widely used approach is to find a path that contains common vertices in a word graph model. The first issue with this approach is that the path-finding algorithm can separate the words of a phrase that expresses a single piece of content, which can produce new sentences whose meaning differs from that of the originals. The second issue arises when the same information is expressed by different words or phrases, a situation known as co-reference. Without a mechanism for handling this situation, the compression loses information. In this paper we propose a method that addresses both issues. The core of the new method is an improved graph model in which each vertex represents a phrase together with its Part-of-Speech label. The vertices where branches intersect are produced by a mechanism for handling co-references. The compression algorithm reduces the graph and forms the final sentence. We use the ROUGE measure to compare our method with two word graph-based baselines. The experimental results show that our method produces short sentences that contain rich information.
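To make the idea of a phrase-level graph concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that the input has already been chunked into (phrase, POS) pairs, that co-references are supplied as a hypothetical `coref` lookup table, and that edges are weighted by inverse frequency and searched with Dijkstra's algorithm, a generic choice borrowed from word graph compression in general. The names `build_graph`, `shortest_path`, and `compress` are illustrative, and the example uses English phrases for readability.

```python
import heapq
from collections import defaultdict

# Sentinel vertices marking sentence start and end (illustrative names).
START, END = ("<s>", "START"), ("</s>", "END")


def build_graph(sentences, coref=None):
    """Merge sentences into one directed graph of (phrase, POS) vertices.

    sentences: list of sentences, each a list of (phrase, pos) tuples.
    coref: hypothetical dict mapping a mention to its canonical phrase,
           so that co-referring mentions share a single vertex.
    """
    coref = coref or {}
    counts = defaultdict(int)
    for sent in sentences:
        path = [START] + [(coref.get(p, p), pos) for p, pos in sent] + [END]
        for a, b in zip(path, path[1:]):
            counts[(a, b)] += 1
    graph = defaultdict(dict)
    for (a, b), n in counts.items():
        graph[a][b] = 1.0 / n  # assumption: frequent transitions are cheaper
    return graph


def shortest_path(graph, source=START, target=END):
    """Dijkstra's algorithm; returns the vertices strictly between source and target."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1][1:-1]


def compress(sentences, coref=None):
    """Join the phrases on the cheapest START-to-END path into one sentence."""
    return " ".join(p for p, _ in shortest_path(build_graph(sentences, coref)))


example = [
    [("the committee", "NP"), ("approved", "VP"), ("the budget", "NP")],
    [("it", "NP"), ("approved", "VP"), ("the budget", "NP"), ("on Monday", "PP")],
]
print(compress(example, coref={"it": "the committee"}))
```

Because each vertex holds a whole phrase, a path through the graph cannot split "the budget" into separate words, and mapping "it" to "the committee" lets both sentences pass through the same vertex instead of forming two unrelated branches.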


      • Published in

        SoICT '17: Proceedings of the 8th International Symposium on Information and Communication Technology
        December 2017
        486 pages
        ISBN: 9781450353281
        DOI: 10.1145/3155133

        Copyright © 2017 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 147 of 318 submissions, 46%