DOI: 10.1145/3242969.3264973
Research article · Public Access

Unobtrusive Analysis of Group Interactions without Cameras

Published: 02 October 2018

ABSTRACT

Group meetings are often inefficient, unorganized, and poorly documented. Factors including "groupthink," fear of speaking, unfocused discussion, and bias can degrade the performance of a group meeting. To actively or passively facilitate group meetings, automatically analyzing group interaction patterns is critical. Existing research on group dynamics analysis still depends heavily on video cameras in the lines of sight of participants or on wearable sensors, both of which can affect the natural behavior of participants. In this thesis, we present a smart meeting room that combines microphones with unobtrusive ceiling-mounted Time-of-Flight (ToF) sensors to understand group dynamics in team meetings. Since the ToF sensors are ceiling-mounted and out of the participants' lines of sight, we posit that their presence does not disrupt the natural interaction patterns of individuals. We collect a new multimodal dataset of group interactions in which participants must complete a task by reaching a group consensus and then fill out a post-task questionnaire; we use this dataset to develop our algorithms and analyze group meetings. In this paper, we combine the ceiling-mounted ToF sensors and lapel microphones to: (1) estimate the seated body orientation of participants, (2) estimate the head pose and visual focus of attention (VFOA) of meeting participants, (3) estimate the arm pose and body posture of participants, and (4) analyze the multimodal data for passive understanding of group meetings, with a focus on perceived leadership and contribution.
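One way to picture step (2) is a common VFOA baseline: given an estimated head-pose yaw for a seated participant, assign their visual focus of attention to the seat whose direction best matches that yaw. The sketch below is illustrative only; the seat coordinates, function name, and the nearest-angle rule are assumptions for this example, not the paper's actual algorithm.

```python
import math

# Hypothetical seat positions (x, y) in room coordinates; values are
# illustrative, not taken from the paper's meeting room.
SEATS = {"A": (0.0, 0.0), "B": (1.5, 0.0), "C": (1.5, 1.2), "D": (0.0, 1.2)}

def vfoa_target(subject, yaw_rad, seats=SEATS):
    """Nearest-angle VFOA baseline: return the seat whose direction from
    `subject` best matches the estimated head-pose yaw (radians)."""
    sx, sy = seats[subject]
    best, best_err = None, float("inf")
    for name, (tx, ty) in seats.items():
        if name == subject:
            continue
        angle = math.atan2(ty - sy, tx - sx)
        # Wrap the angular difference to [-pi, pi] before comparing.
        err = abs(math.atan2(math.sin(yaw_rad - angle),
                             math.cos(yaw_rad - angle)))
        if err < best_err:
            best, best_err = name, err
    return best
```

In practice a VFOA model would also smooth head-pose estimates over time and account for non-seat targets (e.g., a shared display), but the nearest-angle rule captures the core geometric idea.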


Published in

ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction
October 2018, 687 pages
ISBN: 9781450356923
DOI: 10.1145/3242969

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States


Acceptance Rates

ICMI '18 paper acceptance rate: 63 of 149 submissions, 42%
Overall acceptance rate: 453 of 1,080 submissions, 42%