DOI: 10.1145/1290128.1290141

Multimedia and human-in-the-loop: interaction as content enrichment

Published: 28 September 2007


The current work is part of the broadband visual research program at the Institute for Information Technology (National Research Council Canada). The research program is currently focused on developing human-centered multimedia technology to support visual communication and collaboration in large groups. This paper outlines some conceptual foundations for the development of a human-centered multimedia research tool that captures interaction data, which could be linked to users' cognitive processing. The approach is based on the notion of multimedia interaction as content enrichment and on cognitive modeling methodology.



Cited By

  • (2020) Construction of Diverse Image Datasets From Web Collections With Limited Labeling. IEEE Transactions on Circuits and Systems for Video Technology 30:4 (1147-1161). DOI: 10.1109/TCSVT.2019.2898899. Online publication date: Apr-2020
  • (2008) An H.323 Broadband Virtual Camera for supporting asynchronous visual communication in large groups. 2008 IEEE International Symposium on Technology and Society (1-4). DOI: 10.1109/ISTAS.2008.4559765. Online publication date: Jun-2008





Published In

HCM '07: Proceedings of the international workshop on Human-centered multimedia
September 2007
112 pages



Association for Computing Machinery

New York, NY, United States




Author Tags

  1. cognitive modeling
  2. context
  3. human interaction modeling from multimedia
  4. task modeling in multimedia systems
  5. unified theories of cognition
  6. user


  • Article


MM07: The 15th ACM International Conference on Multimedia 2007
September 28, 2007
Augsburg, Bavaria, Germany

