DOI: 10.1145/1753326.1753620

SoundNet: investigating a language composed of environmental sounds

Published: 10 April 2010

Abstract

Auditory displays have been used in both human-machine and computer interfaces. However, the use of non-speech audio in assistive communication for people with language disabilities, or in other applications that employ visual representations, is still under-investigated. In this paper, we introduce SoundNet, a linguistic database that associates natural environmental sounds with words and concepts. A sound labeling study was carried out to verify SoundNet associations and to investigate how well the sounds evoke concepts. A second study was conducted using the verified SoundNet data to explore the power of environmental sounds to convey concepts in sentence contexts, compared with conventional icons and animations. Our results show that sounds can effectively illustrate (especially concrete) concepts and can be applied to assistive interfaces.
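The paper's dataset is not reproduced on this page, but the core idea it describes — a database linking environmental sounds to the concepts they evoke, filtered by how reliably listeners label them — can be sketched in a few lines. The following is a hypothetical illustration only: the file names, concept labels, agreement scores, and the `sounds_for` helper are all invented for this sketch and do not reflect SoundNet's actual schema.

```python
# Hypothetical sketch of a sound-to-concept association table in the
# spirit of SoundNet. All names and scores are illustrative, not the
# paper's actual data or schema.
from dataclasses import dataclass

@dataclass
class SoundEntry:
    sound_file: str             # e.g. a clip from a sound-effects library
    concepts: dict[str, float]  # concept label -> labeler agreement in [0, 1]

entries = [
    SoundEntry("dog_bark.wav", {"dog": 0.95, "bark": 0.90, "animal": 0.60}),
    SoundEntry("door_knock.wav", {"knock": 0.88, "door": 0.72}),
]

def sounds_for(concept: str, threshold: float = 0.7) -> list[str]:
    """Return sound files whose labeling agreement for `concept`
    meets the threshold, mirroring how a verified association set
    could back an assistive interface."""
    return [e.sound_file for e in entries
            if e.concepts.get(concept, 0.0) >= threshold]

print(sounds_for("dog"))     # only the dog bark clears the threshold
print(sounds_for("animal"))  # agreement too low: no sound returned
```

The threshold models the paper's finding that concrete concepts are evoked more reliably: abstract or weakly associated labels (like "animal" above) fall below it and would not be offered to a user.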


Published In

CHI '10: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
April 2010
2690 pages
ISBN:9781605589299
DOI:10.1145/1753326
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States

Author Tags

  1. assistive technologies
  2. environmental sound
  3. soundnet

Qualifiers

  • Research-article

Conference

CHI '10

Acceptance Rates

Overall acceptance rate: 6,199 of 26,314 submissions (24%)

Article Metrics

  • Downloads (last 12 months): 11
  • Downloads (last 6 weeks): 0
Reflects downloads up to 20 Feb 2025

Cited By

  • (2022) "Grouping and Determining Perceived Severity of Cyber-Attack Consequences: Gaining Information Needed to Sonify Cyber-Attacks." Journal on Multimodal User Interfaces 16(4), 399-412. DOI: 10.1007/s12193-022-00397-z
  • (2019) "FamilyLog: Monitoring Family Mealtime Activities by Mobile Devices." IEEE Transactions on Mobile Computing. DOI: 10.1109/TMC.2019.2916357
  • (2018) "Perceptual Evaluation of Synthesized Sound Effects." ACM Transactions on Applied Perception 15(2), 1-19. DOI: 10.1145/3165287
  • (2017) "FamilyLog: A mobile system for monitoring family mealtime activities." 2017 IEEE International Conference on Pervasive Computing and Communications (PerCom), 21-30. DOI: 10.1109/PERCOM.2017.7917847
  • (2014) "The Role of Sound Source Perception in Gestural Sound Description." ACM Transactions on Applied Perception 11(1), 1-19. DOI: 10.1145/2536811
  • (2010) "A multimodal vocabulary for augmentative and alternative communication from sound/image label datasets." Proceedings of the NAACL HLT 2010 Workshop on Speech and Language Processing for Assistive Technologies, 62-70. DOI: 10.5555/1867750.1867758
