DOI: 10.1145/2797143.2797182
research-article

Evaluation of Four Approaches for "Sentiment Analysis on Movie Reviews": The Kaggle Competition

Published: 25 September 2015

ABSTRACT

In this paper we present four approaches to automatic sentiment classification of movie reviews: (a) a statistical approach, (b) a bag-of-words approach, (c) a content-based approach, and (d) a lexicon-based approach. All four were evaluated in the "Sentiment Analysis on Movie Reviews" Kaggle competition. The competition results showed that each of the first three approaches achieved approximately 65% sentiment classification accuracy, while the lexicon-based approach performed poorly by comparison. Combining approaches (b), (c), and (d) proved to be our best-performing setup, achieving a classification accuracy of 67.931% and ranking us 6th among 861 participants.
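The abstract does not say how the individual approaches were built or combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a five-level label range of 0 (negative) to 4 (positive), a hypothetical toy lexicon, and a simple majority vote as the combination rule; every name in it (`TOY_LEXICON`, `lexicon_label`, `majority_vote`, `accuracy`) is illustrative.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption): the lexicon used in the paper
# is not given in the abstract.
TOY_LEXICON = {"good": 1, "great": 2, "bad": -1, "awful": -2}


def bag_of_words(phrase):
    """Map a phrase to token counts (lowercased, whitespace-tokenized)."""
    return Counter(phrase.lower().split())


def lexicon_score(phrase, lexicon=TOY_LEXICON):
    """Sum the polarity of every known token, weighted by its count."""
    counts = bag_of_words(phrase)
    return sum(lexicon.get(token, 0) * n for token, n in counts.items())


def lexicon_label(phrase):
    """Clamp the score to [-2, 2] and shift it into the assumed 0..4 range."""
    score = max(-2, min(2, lexicon_score(phrase)))
    return score + 2


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predicted, gold):
    """Fraction of predictions that match the gold labels."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)
```

In this sketch, the labels predicted for one phrase by several classifiers would be passed to `majority_vote`, and the combined predictions scored with `accuracy` against held-out labels. A learned combiner would be an equally plausible reading of "combination scheme".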

    • Published in
      EANN '15: Proceedings of the 16th International Conference on Engineering Applications of Neural Networks (INNS)
      September 2015
      266 pages
      ISBN:9781450335805
      DOI:10.1145/2797143

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      EANN '15 Paper Acceptance Rate: 36 of 60 submissions, 60%
      Overall Acceptance Rate: 36 of 60 submissions, 60%
