Extracting information from multimedia meeting collections
Source: International Multimedia Conference
Proceedings of the 7th ACM SIGMM international workshop on Multimedia information retrieval
Hilton, Singapore
SESSION: Special session 2: multimedia information retrieval: challenges and real-world applications
Pages: 245 - 252  
Year of Publication: 2005
ISBN: 1-59593-244-5
Authors
Daniel Gatica-Perez  IDIAP Research Institute, Martigny, Switzerland
Dong Zhang  IDIAP Research Institute, Martigny, Switzerland
Samy Bengio  IDIAP Research Institute, Martigny, Switzerland
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1101826.1101865

ABSTRACT

Multimedia meeting collections are composed of unedited audio and video streams, handwritten notes, slides, and electronic documents that jointly constitute a raw record of complex human interaction processes in the workplace. They have attracted interest because of the increasing feasibility of recording them in large quantities, the opportunities for information access and retrieval applications that arise from the automatic extraction of relevant meeting information, and the challenges that extracting semantic information from real human activities entails. In this paper, we present a succinct overview of recent approaches in this field, largely influenced by our own experiences. We first review some of the existing and potential needs of users of multimedia meeting information systems. We then summarize recent work in various research areas that address some of these requirements. In more detail, we describe our work on the automatic analysis of human interaction patterns from audio-visual sensors and discuss open issues in this domain.


