Perceptual Evaluation of Synthesized Sound Effects

Authors:
David Moffat
Queen Mary University of London, Godward Square, London

Joshua D. Reiss
Queen Mary University of London, Godward Square, London

ACM Transactions on Applied Perception, Volume 15, Issue 2, Article No. 13, pp. 1–19. https://doi.org/10.1145/3165287

Published: 10 April 2018

Abstract

Sound synthesis is the process of generating artificial sounds through some form of simulation or modelling. This article aims to identify which sound synthesis methods achieve the goal of producing a believable audio sample that may replace a recorded sound sample. A perceptual evaluation experiment of five different sound synthesis techniques was undertaken. Additive synthesis, statistical modelling synthesis with two different feature sets, physically inspired synthesis, concatenative synthesis, and sinusoidal modelling synthesis were all compared. The evaluation used eight different sound class stimuli and 66 different samples. The additive synthesizer is the only synthesis method not considered significantly different from the reference sample across all sound classes. The results demonstrate that sound synthesis can be considered as realistic as a recorded sample, and the article makes recommendations for the use of synthesis methods given different sound class contexts.

Published in
ACM Transactions on Applied Perception Volume 15, Issue 2
April 2018
104 pages
ISSN: 1544-3558
EISSN: 1544-3965
DOI: 10.1145/3190502
Editors:
Victoria Interrante, University of Minnesota, USA
Martin Giese, University of Tübingen, Germany
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 10 April 2018
- Accepted: 1 November 2017
- Revised: 1 October 2017
- Received: 1 July 2017
Author Tags
Sound synthesis
evaluation
perception
procedural audio
sound effects
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 6
- Total Downloads: 452
- Downloads (Last 12 months): 53
- Downloads (Last 6 weeks): 5