ABSTRACT
Generating referring expressions is a task that has received considerable attention in the natural-language generation community, with much recent effort targeted at the generation of multimodal referring expressions. However, most implemented systems assume very little shared knowledge between the speaker and the hearer, and must therefore generate fully elaborated linguistic references. Some systems do include a representation of the physical context or the dialogue context, but other sources of contextual information are not normally used. Moreover, the generated references normally consist only of language and, possibly, deictic pointing gestures.
When referring to objects in the context of a task-based interaction that involves jointly manipulating objects, a much richer notion of context is available, which permits a wider range of referring options. In particular, when conversational partners cooperate on a mutual task in a shared environment, an object can be made accessible simply by manipulating it as part of the task. We demonstrate that such haptic-ostensive referring expressions are common in a corpus of human-human dialogues collected during a joint construction task involving virtual objects, and we then describe how this type of reference can be incorporated into the output of a humanoid robot that engages in similar joint construction dialogues with a human partner.
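To make the idea concrete, the following minimal Python sketch shows one way a generation module might choose among a haptic-ostensive reference, a pointing gesture, and a full linguistic description. This is our own illustrative assumption, not the authors' implementation: all names here (`Obj`, `Manipulation`, `ReferringAct`, `choose_referring_act`) are hypothetical, and the decision rule simply prefers the least elaborate act that still identifies the target.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Obj:
    obj_type: str   # e.g. "bolt", "slat"
    colour: str

@dataclass(frozen=True)
class Manipulation:
    action: str     # e.g. "pick_up"
    target: Obj

@dataclass(frozen=True)
class ReferringAct:
    modality: str   # "haptic-ostensive", "point", or "describe"
    speech: str     # accompanying spoken material

def choose_referring_act(target: Obj,
                         next_step: Optional[Manipulation],
                         visible: list[Obj]) -> ReferringAct:
    """Pick the least elaborate referring act that identifies `target`."""
    # 1. If the task plan already has the robot manipulating the target,
    #    the manipulation itself makes the object accessible: a reduced
    #    expression uttered while grasping is enough.
    if next_step is not None and next_step.target == target:
        return ReferringAct("haptic-ostensive", "this one")
    # 2. If the target's type is unique in the shared scene, a deictic
    #    pointing gesture plus the type name disambiguates it.
    if sum(o.obj_type == target.obj_type for o in visible) == 1:
        return ReferringAct("point", f"the {target.obj_type}")
    # 3. Otherwise fall back on a fully elaborated linguistic reference.
    return ReferringAct("describe", f"the {target.colour} {target.obj_type}")

# Example: the red bolt is about to be picked up as part of the plan,
# so a haptic-ostensive reference suffices.
red_bolt = Obj("bolt", "red")
scene = [red_bolt, Obj("bolt", "blue"), Obj("slat", "green")]
act = choose_referring_act(red_bolt, Manipulation("pick_up", red_bolt), scene)
print(act.modality, "-", act.speech)   # haptic-ostensive - this one
```

In a fuller system the ordering of these cases would presumably be learned from, or at least validated against, the frequencies observed in the human-human corpus rather than hard-coded.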