Research article · DOI: 10.1145/1452392.1452424

MultiML: a general purpose representation language for multimodal human utterances

Published: 20 October 2008

Abstract

We present MultiML, a markup language for the annotation of multimodal human utterances. MultiML is able to represent input from several modalities, as well as the relationships between these modalities. Since MultiML separates general parts of representation from more context-specific aspects, it can easily be adapted for use in a wide range of contexts. This paper demonstrates how speech and gestures are described with MultiML, showing the principles, including hierarchy and underspecification, that ensure the quality and extensibility of MultiML. As a proof of concept, we show how MultiML is used to annotate a sample human-robot interaction in the domain of a multimodal joint-action scenario.
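To make the abstract concrete, the fragment below sketches what a hierarchical, cross-modal annotation with an underspecified referent might look like in an XML-based language of this kind. It is purely illustrative: the element and attribute names (`utterance`, `speech`, `gesture`, `link`, `referent`, and so on) are invented for this sketch and are not taken from the actual MultiML schema.

```xml
<!-- Hypothetical sketch only: element and attribute names are invented
     for illustration and do not reproduce the actual MultiML schema. -->
<utterance id="u1">
  <speech id="s1">
    <word id="w1">take</word>
    <word id="w2">this</word>
    <word id="w3">cube</word>
  </speech>
  <!-- A pointing gesture recognized in parallel with the speech. -->
  <gesture id="g1" type="pointing" target="object-3"/>
  <!-- Cross-modal relationship: the deictic word "this" is linked to gesture g1. -->
  <link from="w2" to="g1" relation="deixis"/>
  <!-- Underspecification: the referent stays open until context resolves it. -->
  <referent id="r1" value="underspecified"/>
</utterance>
```

The sketch illustrates the two principles the abstract names: hierarchy (modality-specific annotations nested inside a common utterance node) and underspecification (values left open for later, context-specific resolution).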


Cited By

  • (2022) Towards Situated AMR: Creating a Corpus of Gesture AMR. In: Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Health, Operations Management, and Design, pp. 293-312. DOI: 10.1007/978-3-031-06018-2_21. Online: 16 Jun 2022
  • (2018) Applications in HHI: Physical Cooperation. In: Humanoid Robotics: A Reference, pp. 2221-2259. DOI: 10.1007/978-94-007-6046-2_129. Online: 10 Oct 2018
  • (2017) DICE-R. In: Proceedings of the ACM SIGCHI Symposium on Engineering Interactive Computing Systems, pp. 117-122. DOI: 10.1145/3102113.3102147. Online: 26 Jun 2017
  • (2017) Applications in HHI: Physical Cooperation. In: Humanoid Robotics: A Reference, pp. 1-39. DOI: 10.1007/978-94-007-7194-9_129-1. Online: 10 Oct 2017
  • (2011) Cognitive Memory for Semantic Agents Architecture in Robotic Interaction. In: International Journal of Cognitive Informatics and Natural Intelligence, vol. 5, no. 1, pp. 43-58. DOI: 10.4018/jcini.2011010103. Online: Jan 2011
  • (2010) Situated reference in a hybrid human-robot interaction system. In: Proceedings of the 6th International Natural Language Generation Conference, pp. 67-75. DOI: 10.5555/1873738.1873749. Online: 7 Jul 2010
  • (2009) Evaluating description and reference strategies in a cooperative human-robot dialogue system. In: Proceedings of the 21st International Joint Conference on Artificial Intelligence, pp. 1818-1823. DOI: 10.5555/1661445.1661737. Online: 11 Jul 2009
  • (2009) Towards a Modeling Language for Designing Auditory Interfaces. In: Proceedings of the 5th International Conference on Universal Access in Human-Computer Interaction. Part III: Applications and Services, pp. 502-511. DOI: 10.1007/978-3-642-02713-0_53. Online: 14 Jul 2009

    Published In

ICMI '08: Proceedings of the 10th International Conference on Multimodal Interfaces
October 2008, 322 pages
ISBN: 978-1-60558-198-9
DOI: 10.1145/1452392
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States



    Author Tags

    1. human-robot interaction
    2. multimodal
    3. representation


Conference

ICMI '08: International Conference on Multimodal Interfaces
October 20-22, 2008, Chania, Crete, Greece

    Acceptance Rates

Overall acceptance rate: 453 of 1,080 submissions, 42%


