ABSTRACT
Group meetings are often inefficient, disorganized, and poorly documented. Factors including "groupthink," fear of speaking, unfocused discussion, and bias can degrade the performance of a group meeting. Automatically analyzing group interaction patterns is critical for actively or passively facilitating group meetings. Existing research on group dynamics analysis still depends heavily on video cameras placed in participants' lines of sight or on wearable sensors, both of which can alter participants' natural behavior. In this thesis, we present a smart meeting room that combines microphones with unobtrusive ceiling-mounted Time-of-Flight (ToF) sensors to understand group dynamics in team meetings. Since the ToF sensors are ceiling-mounted and out of the participants' lines of sight, we posit that their presence does not disrupt natural interaction patterns. We collect a new multimodal dataset of group interactions in which participants must complete a task by reaching a group consensus and then fill out a post-task questionnaire; we use this dataset to develop our algorithms and analyze group meetings. Combining the ceiling-mounted ToF sensors with lapel microphones, we: (1) estimate the seated body orientation of participants, (2) estimate their head pose and visual focus of attention (VFOA), (3) estimate their arm pose and body posture, and (4) analyze the multimodal data for passive understanding of group meetings, with a focus on perceived leadership and contribution.
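To make the sensing pipeline concrete, the following is a minimal toy sketch of step (1), seated body orientation from a single overhead ToF depth frame. The segmentation thresholds and the PCA-on-shoulder-pixels heuristic are illustrative assumptions for this sketch, not the thesis's actual algorithm: the idea is simply that the pixels nearest the ceiling sensor form the head, a slightly deeper band forms the shoulders, and the principal axis of the shoulder pixels gives a coarse orientation angle.

```python
import numpy as np

def estimate_seated_orientation(depth, floor_mm=2600, head_band_mm=450):
    """Toy estimate of a seated person's body orientation from an
    overhead depth frame (pixel values = distance from ceiling, in mm).

    Heuristic (illustrative only): pixels closer than `floor_mm` are the
    occupant; the minimum depth is the head, and a band just below the
    head contains the shoulders. The principal axis of the shoulder
    pixels approximates the shoulder line; the body faces perpendicular
    to it. Returns an angle in degrees in [0, 180), or None on failure.
    """
    person = depth < floor_mm                      # occupant mask
    if not person.any():
        return None
    top = depth[person].min()                      # head height
    shoulders = person & (depth > top + 100) & (depth < top + head_band_mm)
    ys, xs = np.nonzero(shoulders)
    if len(xs) < 10:                               # too few pixels to fit
        return None
    pts = np.stack([xs - xs.mean(), ys - ys.mean()])
    cov = pts @ pts.T / len(xs)                    # 2x2 covariance
    w, v = np.linalg.eigh(cov)
    axis = v[:, np.argmax(w)]                      # shoulder-line direction
    # body orientation is perpendicular to the shoulder line;
    # the modulo folds the front/back ambiguity into [0, 180)
    return float(np.degrees(np.arctan2(axis[0], -axis[1])) % 180.0)
```

In the real system, frames like this would be produced continuously by the ceiling-mounted ToF array; head pose and VFOA estimation (steps 2 and 3) would then refine this coarse angle using finer depth structure and the audio channel.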