Search Vox: leveraging multimodal refinement and partial knowledge for mobile voice search

Authors:

Bongshin Lee

UIST '08: Proceedings of the 21st annual ACM symposium on User interface software and technology

Pages 141 - 150

https://doi.org/10.1145/1449715.1449738

Published: 19 October 2008

Abstract

Internet usage on mobile devices continues to grow as users seek anytime, anywhere access to information. Because users frequently search for businesses, directory assistance has been the focus of many voice search applications utilizing speech as the primary input modality. Unfortunately, mobile settings often contain noise which degrades performance. As such, we present Search Vox, a mobile search interface that not only facilitates touch and text refinement whenever speech fails, but also allows users to assist the recognizer via text hints. Search Vox can also take advantage of any partial knowledge users may have about the business listing by letting them express their uncertainty in an intuitive way using verbal wildcards. In simulation experiments conducted on real voice search data, leveraging multimodal refinement resulted in a 28% relative reduction in error rate. Providing text hints along with the spoken utterance resulted in even greater relative reduction, with dramatic gains in recovery for each additional character.
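
To make the headline figure concrete: a relative reduction measures the change in error rate as a fraction of the baseline error rate, rather than as a difference in percentage points. The sketch below is illustrative only. The function name and the example error rates are assumptions chosen to show the arithmetic; they are not figures reported in the paper.

```python
def relative_error_reduction(baseline_error: float, improved_error: float) -> float:
    """Return the fraction of the baseline error rate eliminated by the improved system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - improved_error) / baseline_error


# Hypothetical rates (not from the paper): a baseline error of 50% falling to 36%
# is a 14-point absolute drop, which corresponds to a 28% relative reduction.
reduction = relative_error_reduction(0.50, 0.36)
print(f"{reduction:.0%}")
```

The same relative reduction can correspond to very different absolute gains depending on the baseline, which is why the baseline error rate matters when interpreting such a figure.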

Index Terms

Search Vox: leveraging multimodal refinement and partial knowledge for mobile voice search
1. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction devices
      1. Touch screens
    2. Interaction paradigms
      1. Graphical user interfaces

Published In

UIST '08: Proceedings of the 21st annual ACM symposium on User interface software and technology

October 2008

308 pages

ISBN:9781595939753

DOI:10.1145/1449715

General Chair: Steve Cousins (Willow Garage, USA)

Program Chair: Michel Beaudouin-Lafon (Université Paris-Sud, France)

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

UIST08: The 21st Annual ACM Symposium on User Interface Software and Technology

October 19 - 22, 2008

Monterey, CA, USA

Acceptance Rates

Overall Acceptance Rate 561 of 2,567 submissions, 22%
