research-article

Embodied Collaborative Referring Expression Generation in Situated Human-Robot Interaction

Authors:
Rui Fang, Michigan State University, East Lansing, MI, USA
Malcolm Doering, Michigan State University, East Lansing, MI, USA
Joyce Y. Chai, Michigan State University, East Lansing, MI, USA

HRI '15: Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction, March 2015, Pages 271–278
https://doi.org/10.1145/2696454.2696467

Published: 02 March 2015

ABSTRACT

To facilitate referential communication between humans and robots and mediate their differences in representing the shared environment, we are exploring embodied collaborative models for referring expression generation (REG). Instead of a single minimum description of a target object, episodes of expressions are generated based on human feedback during human-robot interaction. We particularly investigate the role of embodiment, such as robot gesture behaviors (i.e., pointing to an object) and human gaze feedback (i.e., looking at a particular object), in the collaborative process. This paper examines different strategies for incorporating embodiment and collaboration in REG and discusses their possibilities and challenges in enabling human-robot referential communication.

Index Terms

1. Human-centered computing
  1. Human computer interaction (HCI)

Published in
HRI '15: Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction
March 2015
368 pages
ISBN: 9781450328838
DOI: 10.1145/2696454
General Chairs: Julie A. Adams (Vanderbilt University, USA), William Smart (Oregon State University, USA)
Program Chairs: Bilge Mutlu (University of Wisconsin-Madison, USA), Leila Takayama (Google[x], USA)
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
collaborative model
human-robot dialogue
referring expression generation
Acceptance Rates
HRI '15 Paper Acceptance Rate: 43 of 169 submissions, 25%
Overall Acceptance Rate: 242 of 1,000 submissions, 24%