
Evaluation of Four Approaches for "Sentiment Analysis on Movie Reviews": The Kaggle Competition

Authors:
Athanasia Koumpouri, Iosif Mporas, Vasileios Megalooikonomou

Dept. of Computer Engineering and Informatics, University of Patras, 26500-Rion, Greece, 0030 2610997534

EANN '15: Proceedings of the 16th International Conference on Engineering Applications of Neural Networks (INNS), September 2015, Article No. 23, Pages 1–5
https://doi.org/10.1145/2797143.2797182

Published: 25 September 2015

ABSTRACT

In this paper we present four approaches to automatic sentiment classification of movie reviews: (a) statistical, (b) bag-of-words, (c) content-based, and (d) lexicon-based. All four were evaluated in the "Sentiment Analysis on Movie Reviews" Kaggle competition. The competition results showed that each of the first three approaches achieved approximately 65% sentiment classification accuracy, while the lexicon-based approach performed poorly by comparison. Combining approaches (b), (c), and (d) gave our best-performing setup, with a classification accuracy of 67.931%, which ranked us 6th among 861 participants.
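The abstract does not say how the individual approaches were combined, so the following is only a minimal sketch of one common option, plurality voting over per-phrase labels, together with the accuracy metric used to score submissions. The function names, the tie-breaking rule, and the toy predictions are all assumptions made for illustration; they are not the authors' method or results. Labels are taken to be the competition's five sentiment classes, 0 to 4.

```python
from collections import Counter


def majority_vote(predictions_per_model, tie_break_order):
    """Combine per-item labels from several classifiers by plurality vote.

    predictions_per_model: list of equal-length label lists, one per model.
    tie_break_order: every model index, most trusted first (hypothetical rule).
    """
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        tied = {label for label, n in counts.items() if n == top}
        # Take the label of the most trusted model that voted for a top label.
        winner = next(votes[i] for i in tie_break_order if votes[i] in tied)
        combined.append(winner)
    return combined


def accuracy(predicted, gold):
    """Fraction of items whose predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Toy labels (0-4), invented for illustration only.
bow_pred = [2, 3, 1, 4, 0, 2]      # stands in for approach (b)
content_pred = [2, 3, 2, 4, 1, 2]  # stands in for approach (c)
lexicon_pred = [2, 1, 2, 3, 1, 0]  # stands in for approach (d)
gold = [2, 3, 2, 4, 0, 2]

combined = majority_vote([bow_pred, content_pred, lexicon_pred], [0, 1, 2])
score = accuracy(combined, gold)
```

In this toy case the vote corrects items where a single model errs, but it cannot help when two of the three models agree on a wrong label. A weighted vote or a stacked classifier would be alternative designs under the same assumptions.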


Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources

Published in
EANN '15: Proceedings of the 16th International Conference on Engineering Applications of Neural Networks (INNS)
September 2015
266 pages
ISBN: 9781450335805
DOI: 10.1145/2797143
Editors: Lazaros Iliadis (Democritus University of Thrace, Greece), Chrisina Jane (Coventry University, UK)
Copyright © 2015 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Opinion mining
sentiment analysis
sentiment classification
sentiment polarity
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
EANN '15 paper acceptance rate: 36 of 60 submissions, 60%. Overall acceptance rate: 36 of 60 submissions, 60%.