DOI: 10.1145/3269206.3271722
research-article

Multi-Source Pointer Network for Product Title Summarization

Published: 17 October 2018

ABSTRACT

In this paper, we study the product title summarization problem in E-commerce applications for display on mobile devices. Compared with conventional sentence summarization, product title summarization has some extra and essential constraints. For example, factual errors or loss of key information are intolerable in E-commerce applications. Therefore, we formulate two additional constraints for product title summarization: (i) do not introduce irrelevant information; (ii) retain the key information (e.g., brand name and commodity name). To address these issues, we propose a novel multi-source pointer network that adds a new knowledge encoder to the pointer network. The first constraint is handled by the pointer mechanism. For the second constraint, we restore the key information by copying words from the knowledge encoder with the help of a soft gating mechanism. For evaluation, we build a large collection of real-world product titles along with human-written short titles. Experimental results demonstrate that our model significantly outperforms the other baselines. Finally, online deployment of our proposed model has yielded a significant business impact, as measured by the click-through rate.
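The copy step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `multi_source_copy_distribution`, the fixed scalar `gate`, and the toy attention scores are assumptions made for illustration only. In a trained model the gate and the attention scores would come from learned encoder and decoder states. The sketch shows only the idea that the output distribution is a gated mixture of two pointer (attention) distributions, one over the source title and one over the knowledge tokens, so that only words present in one of the two inputs can be emitted.

```python
import math
from collections import defaultdict


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_source_copy_distribution(title_tokens, title_scores,
                                   knowledge_tokens, knowledge_scores, gate):
    """Mix two pointer distributions with a soft gate (hypothetical helper).

    gate is the weight on the title pointer; (1 - gate) goes to the
    knowledge pointer. Here it is a plain number supplied by the caller,
    which is an assumption of this sketch.
    """
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    title_attn = softmax(title_scores)
    knowledge_attn = softmax(knowledge_scores)
    dist = defaultdict(float)
    # Attention mass on repeated tokens is summed, so a word that appears
    # in both sources receives probability from both pointers.
    for token, weight in zip(title_tokens, title_attn):
        dist[token] += gate * weight
    for token, weight in zip(knowledge_tokens, knowledge_attn):
        dist[token] += (1.0 - gate) * weight
    return dict(dist)


# Toy inputs (made up): a long title and its key information.
title = ["nike", "men", "running", "shoes", "free", "shipping"]
knowledge = ["nike", "shoes"]
dist = multi_source_copy_distribution(
    title, [2.0, 0.1, 0.5, 1.5, -1.0, -1.0],
    knowledge, [1.0, 1.0],
    gate=0.7,
)
best = max(dist, key=dist.get)
```

Because every output word is copied from one of the two inputs, a decoder built this way cannot introduce words that appear in neither source, which corresponds to constraint (i). The knowledge pointer raises the probability of brand and commodity words, which corresponds to constraint (ii).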


    Published in

      CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
      October 2018
      2362 pages
      ISBN: 9781450360142
      DOI: 10.1145/3269206

      Copyright © 2018 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      CIKM '18 paper acceptance rate: 147 of 826 submissions, 18%
      Overall acceptance rate: 1,861 of 8,427 submissions, 22%
