DOI: 10.1145/3206025.3206074

research-article

Feature Selection and Multimodal Fusion for Estimating Emotions Evoked by Movie Clips

Published: 05 June 2018

ABSTRACT

Perceptual understanding of media content has many applications, including content-based retrieval, marketing, content optimization, psychological assessment, and affect-based learning. In this paper, we model audio-visual features extracted from videos via machine learning approaches to estimate the affective responses of viewers. We use the LIRIS-ACCEDE dataset and the MediaEval 2017 Challenge setting to evaluate the proposed methods. This dataset is composed of movies of professional or amateur origin, annotated with viewers' arousal, valence, and fear scores. We extract a number of audio features, such as Mel-frequency cepstral coefficients, and visual features, such as dense SIFT, hue-saturation histograms, and features from a deep neural network trained for object recognition. We contrast two different approaches in the paper, and report experiments with different fusion and smoothing strategies. We demonstrate the benefit of feature selection and multimodal fusion for estimating affective responses to movie segments.
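The abstract describes fusing predictions from audio and visual features and applying smoothing over time, but the page gives no implementation details. The sketch below is only an illustration of weighted late fusion followed by exponential smoothing; the function names, the equal fusion weights, and the smoothing factor are assumptions, not the paper's actual method.

```python
import numpy as np

def late_fusion(preds, weights):
    """Weighted late fusion of per-modality prediction sequences.

    preds: list of 1-D arrays, one per modality (e.g. audio-based
           and visual-based arousal scores over movie segments).
    weights: one fusion weight per modality.
    """
    stacked = np.stack(preds)              # (n_modalities, n_segments)
    w = np.asarray(weights)[:, None]       # broadcast weights over time
    return (w * stacked).sum(axis=0)

def exp_smooth(x, alpha=0.3):
    """Simple exponential smoothing of a prediction sequence."""
    out = np.empty(len(x), dtype=float)
    out[0] = x[0]
    for t in range(1, len(x)):
        out[t] = alpha * x[t] + (1 - alpha) * out[t - 1]
    return out

# Hypothetical per-segment arousal predictions from two modalities.
audio_pred = np.array([0.1, 0.4, 0.9, 0.2])
visual_pred = np.array([0.2, 0.3, 0.7, 0.4])

fused = late_fusion([audio_pred, visual_pred], [0.5, 0.5])
smoothed = exp_smooth(fused)
```

In practice the fusion weights would be tuned on a validation split, and the paper also explores feature-level selection before fusion; neither step is shown here.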


        • Published in

          cover image ACM Conferences
          ICMR '18: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval
          June 2018
          550 pages
ISBN: 9781450350464
DOI: 10.1145/3206025

          Copyright © 2018 ACM


          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Acceptance Rates

ICMR '18 paper acceptance rate: 44 of 136 submissions (32%). Overall acceptance rate: 254 of 830 submissions (31%).
