research-article

Perceptual Evaluation of Synthesized Sound Effects

Published: 10 April 2018

Abstract

Sound synthesis is the process of generating artificial sounds through some form of simulation or modelling. This article aims to identify which sound synthesis methods can produce a believable audio sample that may replace a recorded sound sample. A perceptual evaluation experiment of five different sound synthesis techniques was undertaken: additive synthesis, statistical modelling synthesis with two different feature sets, physically inspired synthesis, concatenative synthesis, and sinusoidal modelling synthesis. The evaluation used eight different sound class stimuli and 66 different samples. The additive synthesizer is the only synthesis method not considered significantly different from the reference sample across all sound classes. The results demonstrate that synthesized sound can be considered as realistic as a recorded sample, and the article makes recommendations for the use of synthesis methods given different sound class contexts.


        • Published in

          ACM Transactions on Applied Perception, Volume 15, Issue 2
          April 2018
          104 pages
          ISSN: 1544-3558
          EISSN: 1544-3965
          DOI: 10.1145/3190502

          Copyright © 2018 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 10 April 2018
          • Accepted: 1 November 2017
          • Revised: 1 October 2017
          • Received: 1 July 2017

          Qualifiers

          • research-article
          • Research
          • Refereed
