DOI: 10.1145/3173574.3173717 — CHI Conference Proceedings
Research article · Open access

HindSight: Enhancing Spatial Awareness by Sonifying Detected Objects in Real-Time 360-Degree Video

Published: 19 April 2018

Abstract

Our perception of our surrounding environment is limited by the constraints of human biology. The field of augmented perception asks how our sensory capabilities can be usefully extended through computational means. We argue that spatial awareness can be enhanced by exploiting recent advances in computer vision which make high-accuracy, real-time object detection feasible in everyday settings. We introduce HindSight, a wearable system that increases spatial awareness by detecting relevant objects in live 360-degree video and sonifying their position and class through bone conduction headphones. HindSight uses a deep neural network to locate and attribute semantic information to objects surrounding a user through a head-worn panoramic camera. It then uses bone conduction headphones, which preserve natural auditory acuity, to transmit audio notifications for detected objects of interest. We develop an application using HindSight to warn cyclists of approaching vehicles outside their field of view and evaluate it in an exploratory study with 15 users. Participants reported increases in perceived safety and awareness of approaching vehicles when using HindSight.
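The pipeline the abstract describes — detect objects in a live equirectangular 360-degree frame, then sonify their class and position — can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the frame width, class names, confidence threshold, and field-of-view cutoff are all assumptions made for the example.

```python
# Hedged sketch of a HindSight-style cue pipeline: map detections in an
# equirectangular frame to stereo audio cues for out-of-view vehicles.
# All constants and names below are illustrative assumptions.

import math

FRAME_WIDTH = 1920                             # assumed equirectangular frame width (px)
CLASSES_OF_INTEREST = {"car", "bus", "truck"}  # e.g., approaching vehicles

def azimuth_from_x(x_center: float, frame_width: int = FRAME_WIDTH) -> float:
    """Map a horizontal pixel position in an equirectangular frame to an
    azimuth in degrees: 0 is straight ahead, +/-180 is directly behind."""
    return (x_center / frame_width) * 360.0 - 180.0

def stereo_pan(azimuth_deg: float) -> float:
    """Map azimuth to a stereo pan value in [-1, 1] (-1 = hard left, 1 = hard right)."""
    return math.sin(math.radians(azimuth_deg))

def cues_for_detections(detections):
    """Given (class_name, x_center, confidence) tuples from a detector,
    return audio cues for out-of-view objects of interest."""
    cues = []
    for cls, x_center, conf in detections:
        if cls not in CLASSES_OF_INTEREST or conf < 0.5:
            continue
        az = azimuth_from_x(x_center)
        if abs(az) > 90.0:  # roughly outside the rider's forward field of view
            cues.append({"class": cls, "azimuth": az, "pan": stereo_pan(az)})
    return cues
```

In this sketch, a car detected near the right edge of the frame (i.e., behind the rider) yields a cue panned partially right, while objects in the forward field of view are suppressed — consistent with the paper's goal of warning cyclists about vehicles outside their field of view.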

Supplementary Material

• ZIP file: pn1951-file4.zip
• Supplemental video: suppl.mov (pn1951-file3.mp4)




    Published In

CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems
April 2018, 8489 pages
ISBN: 9781450356206
DOI: 10.1145/3173574

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. 360-degree video
    2. augmented perception
    3. computer vision
    4. sonification


    Funding Sources

    • NSF
    • MARCO and DARPA

Conference

CHI '18
CHI '18 Paper Acceptance Rate: 666 of 2,590 submissions, 26%
Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%


Article Metrics

• Downloads (last 12 months): 244
• Downloads (last 6 weeks): 40

Reflects downloads up to 22 Feb 2025

Cited By

• (2024) ShoulderTapper: Providing Directional Cues through Electrotactile Feedback for Target Acquisition in Pick-by-Light Systems. Proceedings of the International Conference on Mobile and Ubiquitous Multimedia, 228-234. DOI: 10.1145/3701571.3701597
• (2024) Hey, What's Going On? Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8(2), 1-24. DOI: 10.1145/3659618
• (2024) SonicVista: Towards Creating Awareness of Distant Scenes through Sonification. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8(2), 1-32. DOI: 10.1145/3659609
• (2024) ReAR Indicators: Peripheral Cycling Indicators for Rear-Approaching Hazards. Proceedings of the 2024 International Conference on Advanced Visual Interfaces, 1-9. DOI: 10.1145/3656650.3656659
• (2024) SonoHaptics: An Audio-Haptic Cursor for Gaze-Based Object Selection in XR. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-19. DOI: 10.1145/3654777.3676384
• (2024) Synthetic Visual Sensations: Augmenting Human Spatial Awareness with a Wearable Retinal Electric Stimulation Device. Proceedings of the Augmented Humans International Conference 2024, 15-27. DOI: 10.1145/3652920.3652932
• (2024) Grand challenges in CyclingHCI. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 2577-2590. DOI: 10.1145/3643834.3661550
• (2024) A Large Vision-Language Model based Environment Perception System for Visually Impaired People. 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 221-228. DOI: 10.1109/IROS58592.2024.10801813
• (2024) Cross-modal feedback of tactile and auditory stimuli for cyclists in noisy environments. Sensors and Actuators A: Physical 380, 116031. DOI: 10.1016/j.sna.2024.116031
• (2023) Towards Improving Spatial Orientation using Electrical Muscle Stimulation as Tactile and Force Feedback. Proceedings of the 22nd International Conference on Mobile and Ubiquitous Multimedia, 512-514. DOI: 10.1145/3626705.3631792
