ABSTRACT
In this paper we describe our approach for the Multi-modal Gesture Recognition Challenge organized by ChaLearn in conjunction with the ICMI 2013 conference. The competition's task was to learn a vocabulary of 20 types of Italian gestures performed by different persons and to detect them in sequences. We develop an algorithm to find the gesture intervals in the audio data, extract audio features from those intervals, and train two different models. We engineer features from the skeleton data and use the gesture intervals in the training data to train a model that we afterwards apply to the test sequences using a sliding window. We combine the models through weighted averaging. We find that combining information from two different sources in this way boosts the models' performance significantly.
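The final fusion step described above can be sketched as a weighted average of the per-class probabilities produced by the audio and skeleton models. This is a minimal illustration only: the number of classes matches the challenge vocabulary, but the probability vectors and the fusion weight are illustrative assumptions, not the values used in the paper.

```python
import numpy as np

# 20 gesture classes, as in the challenge vocabulary.
N_CLASSES = 20

def normalize(p):
    """Scale a non-negative score vector so it sums to 1."""
    return p / p.sum(axis=-1, keepdims=True)

# Hypothetical class-probability outputs of the two models
# for one candidate gesture interval (random stand-ins here).
rng = np.random.default_rng(0)
audio_probs = normalize(rng.random(N_CLASSES))
skeleton_probs = normalize(rng.random(N_CLASSES))

# Weighted averaging of the two modalities; 0.6 is an
# illustrative weight, not the one chosen in the paper.
w_audio = 0.6
combined = w_audio * audio_probs + (1 - w_audio) * skeleton_probs

# The fused scores are still a valid probability distribution,
# and the predicted gesture is the highest-scoring class.
predicted_class = int(np.argmax(combined))
```

In such a soft-voting scheme the fusion weight would typically be tuned on a validation set rather than fixed in advance.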
A multi modal approach to gesture recognition from audio and video data