DOI: 10.1145/3025453.3025899

People with Visual Impairment Training Personal Object Recognizers: Feasibility and Challenges

Published: 02 May 2017

Abstract

Blind people often need to identify objects around them, from packages of food to items of clothing. Automatic object recognition continues to provide limited assistance in such tasks because models tend to be trained on images taken by sighted people with different background clutter, scale, viewpoints, occlusion, and image quality than in photos taken by blind users. We explore personal object recognizers, where visually impaired people train a mobile application with a few snapshots of objects of interest and provide custom labels. We adopt transfer learning with a deep learning system for user-defined multi-label k-instance classification. Experiments with blind participants demonstrate the feasibility of our approach, which reaches accuracies over 90% for some participants. We analyze user data and feedback to explore effects of sample size, photo-quality variance, and object shape; and contrast models trained on photos by blind participants to those by sighted participants and generic recognizers.
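
The abstract describes the recognizer only at a high level: a deep network adapted via transfer learning to a handful of user-captured snapshots per object, each with a custom label. The sketch below illustrates that general recipe, not the authors' actual pipeline: an ImageNet-pretrained backbone is frozen and only a new classification head is trained on a few photos per user-defined label. The MobileNetV2 backbone, the photos/<label>/ directory layout, the file name new_snapshot.jpg, and all hyperparameters are assumptions made for this illustration.

# Minimal transfer-learning sketch for a personal object recognizer
# (illustrative only; architecture, data layout, and hyperparameters are assumed,
# not taken from the paper).
import torch
import torch.nn as nn
from PIL import Image
from torchvision import datasets, models, transforms

# Assumed layout: photos/<user_label>/<snapshot>.jpg, a few snapshots per object.
DATA_DIR = "photos"

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

dataset = datasets.ImageFolder(DATA_DIR, transform=preprocess)
loader = torch.utils.data.DataLoader(dataset, batch_size=8, shuffle=True)

# Freeze the ImageNet-pretrained backbone; train only a new head
# sized to the user's own set of labels.
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False
num_objects = len(dataset.classes)
model.classifier[1] = nn.Linear(model.last_channel, num_objects)

optimizer = torch.optim.Adam(model.classifier[1].parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(20):            # the training set is tiny, so training is quick
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

# Inference on a new snapshot: report the user's own label for the top prediction.
model.eval()
img = preprocess(Image.open("new_snapshot.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    predicted = model(img).argmax(dim=1).item()
print(dataset.classes[predicted])  # e.g. the custom label the user recorded

Training only the final layer keeps per-user training cheap and workable with very few snapshots per object, which is the regime the study targets; the paper's experiments vary sample size and photo-quality variance to probe exactly this.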




    Published In

    CHI '17: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems
    May 2017
    7138 pages
    ISBN:9781450346559
    DOI:10.1145/3025453

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 May 2017


    Badges

    • Honorable Mention

    Author Tags

    1. accessibility
    2. blind
    3. computer vision
    4. object recognition
    5. photographs

    Qualifiers

    • Research-article

    Funding Sources

    • JST CREST
    • Shimizu Corporation

    Conference

    CHI '17

    Acceptance Rates

    CHI '17 Paper Acceptance Rate: 600 of 2,400 submissions, 25%
    Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%



    Article Metrics

    • Downloads (Last 12 months): 124
    • Downloads (Last 6 weeks): 11
    Reflects downloads up to 03 Mar 2025


    Cited By

    • Image-to-Text Translation for Interactive Image Recognition: A Comparative User Study with Non-expert Users. Journal of Information Processing 32, 358-368 (2024). DOI: 10.2197/ipsjjip.32.358
    • AI-Vision: A Three-Layer Accessible Image Exploration System for People with Visual Impairments in China. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8(3), 1-27 (9 Sep 2024). DOI: 10.1145/3678537
    • Misfitting With AI: How Blind People Verify and Contest AI Errors. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-17 (27 Oct 2024). DOI: 10.1145/3663548.3675659
    • Understanding How Blind Users Handle Object Recognition Errors: Strategies and Challenges. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-15 (27 Oct 2024). DOI: 10.1145/3663548.3675635
    • AccessShare: Co-designing Data Access and Sharing with Blind People. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-16 (27 Oct 2024). DOI: 10.1145/3663548.3675612
    • Help and The Social Construction of Access: A Case-Study from India. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-12 (27 Oct 2024). DOI: 10.1145/3663548.3675606
    • SonicVista: Towards Creating Awareness of Distant Scenes through Sonification. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8(2), 1-32 (15 May 2024). DOI: 10.1145/3659609
    • ProgramAlly: Creating Custom Visual Access Programs via Multi-Modal End-User Programming. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-15 (13 Oct 2024). DOI: 10.1145/3654777.3676391
    • ViObject. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8(1), 1-26 (6 Mar 2024). DOI: 10.1145/3643547
    • Find My Things: Personalized Accessibility through Teachable AI for People who are Blind or Low Vision. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 1-6 (11 May 2024). DOI: 10.1145/3613905.3648641
