research-article

Machine Learning for Social Multiparty Human–Robot Interaction

Published: 14 October 2014

Abstract

We describe a variety of machine-learning techniques that are being applied to social multiuser human–robot interaction, using a robot bartender as our scenario. We first present a data-driven approach to social state recognition based on supervised learning. We then describe an approach to social skills execution (that is, action selection for generating socially appropriate robot behavior) that is based on reinforcement learning and uses a data-driven simulation of multiple users to train execution policies for social skills. Next, we describe how these components for social state recognition and social skills execution have been integrated into an end-to-end robot bartender system, and we discuss the results of a user evaluation. Finally, we present an alternative unsupervised learning framework that combines social state recognition and social skills execution, based on hierarchical Dirichlet processes and an infinite POMDP interaction manager. The models make use of data from both human–human interactions collected in a number of German bars and human–robot interactions recorded in the evaluation of an initial version of the system.


Published In

ACM Transactions on Interactive Intelligent Systems  Volume 4, Issue 3
Special Issue on Multiple Modalities in Interactive Systems and Robots
October 2014
115 pages
ISSN:2160-6455
EISSN:2160-6463
DOI:10.1145/2660857
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 October 2014
Accepted: 01 March 2014
Revised: 01 January 2014
Received: 01 March 2013
Published in TIIS Volume 4, Issue 3

Author Tags

  1. Social robotics
  2. machine learning
  3. multiuser interaction

Qualifiers

  • Research-article
  • Research
  • Refereed
