ABSTRACT
With the increased usage of internet-based services and the mass of digital content now available online, the organisation of such content has become a major topic of interest both commercially and within academic research. An understanding of the emotional content is a relevant parameter not only for music classification within digital libraries but also for improving users' experiences, via services including automated music recommendation. Although the singing voice is well known for its natural communication of emotion, it is still unclear which specific musical characteristics of this signal are involved in such affective expressions. The presented study investigates which musical parameters of singing relate to the emotional content, by evaluating the perception of emotion in electronically manipulated a cappella audio samples. A group of 24 individuals participated in a perception test evaluating the emotional dimensions of arousal and valence of 104 sung instances. Key results indicate that the rhythmic-melodic contour is potentially related to the perception of arousal, whereas musical syntax and tempo can alter the perception of valence.
Index Terms
- The Perception of Emotion in the Singing Voice: The Understanding of Music Mood for Music Organisation