DOI: 10.1145/3041021.3054146
research-article

Post Summarization of Microblogs of Sporting Events

Published: 3 April 2017

ABSTRACT

Every day, 645 million Twitter users generate approximately 58 million tweets. This raises the question of whether a summary of an event can be generated from this rich set of tweets alone. Key challenges in post summarization from microblog posts include circumventing spam and conversational posts. In this study, we present a novel technique called lexi-temporal clustering (LTC), which identifies key events. LTC uses k-means clustering, and we explore three distance measures for the clustering: Euclidean distance, cosine similarity and Manhattan distance. We collected three original data sets of Twitter microblog posts covering sporting events: one cricket match and two football matches. The match summaries generated by LTC were compared against standard summaries taken from the sports sections of various news outlets, yielding up to 81% precision, 58% recall and 62% F-measure across the data sets. In addition, we report results for all three variants of the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) software, a tool that automatically compares and scores generated summaries against standard summaries.
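To make the role of the distance measure concrete, the following is a minimal Python sketch of k-means with a pluggable distance function. It is not the authors' LTC implementation: the feature vectors, the function names (`euclidean`, `manhattan`, `cosine_distance`, `kmeans`) and the parameters are assumptions made for illustration, and the abstract does not say how tweets are turned into vectors. Note also that the mean-based centroid update used here is only the exact minimizer for squared Euclidean distance; with Manhattan or cosine distance it is a common simplification.

```python
import math
import random


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def assign(points, centroids, distance):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: distance(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, distance, iterations=20, seed=0):
    """Plain k-means with a caller-supplied distance (illustrative only)."""
    rng = random.Random(seed)
    # Start from k distinct input points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        labels = assign(points, centroids, distance)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment so labels match the returned centroids.
    return assign(points, centroids, distance), centroids


# Hypothetical toy vectors standing in for tweet features.
demo_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
for name, fn in [("euclidean", euclidean), ("manhattan", manhattan)]:
    demo_labels, _ = kmeans(demo_points, 2, fn)
    print(name, demo_labels)
```

Swapping the `distance` argument is the only change needed to compare measures on the same data, which mirrors the comparison described in the abstract at a toy scale.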

Published in

WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion
April 2017
1738 pages
ISBN: 9781450349147

Publisher: International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland


Acceptance Rates

WWW '17 Companion paper acceptance rate: 164 of 966 submissions (17%). Overall acceptance rate: 1,899 of 8,196 submissions (23%).
