Article

Classification of video events using 4-dimensional time-compressed motion features

Authors:

Alexander Haubold,

Milind Naphade

CIVR '07: Proceedings of the 6th ACM international conference on Image and video retrieval

Pages 178 - 185

https://doi.org/10.1145/1282280.1282311

Published: 09 July 2007

Abstract

Among the various types of semantic concepts modeled, events pose the greatest challenge, both in the computational power needed to represent them and in the accuracy that can be achieved in modeling them. We introduce a novel low-level visual feature that summarizes motion in a shot. This feature leverages motion vectors from MPEG-encoded video and aggregates local motion vectors over time in a matrix, which we refer to as a motion image. The resulting motion image is representative of the overall motion in a video shot: it compresses the temporal dimension while preserving spatial ordering. Building motion models on this feature lets us combine the power of discriminant modeling with the dynamics of motion in video shots, a combination that cannot be achieved by building generative models over a time series of motion features from multiple frames in the shot. Evaluation of models built using several motion image features on the TRECVID 2005 dataset shows that this novel motion feature yields an average improvement in concept detection performance of 140% over existing motion features. Furthermore, experiments reveal that when this motion feature is combined with static feature representations of a single keyframe from the shot, such as color and texture features, the fused detection yields an improvement of between 4% and 12% over fusion across the static features alone.
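The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how vectors are accumulated or what the four dimensions are. The sketch assumes that each frame supplies one (dx, dy) vector per macroblock, that the four channels are rightward, leftward, downward, and upward motion, and that accumulation is a plain mean over frames. The names `motion_image` and `flatten` are hypothetical.

```python
from typing import List, Tuple

Vector = Tuple[float, float]   # (dx, dy) motion of one macroblock
Field = List[List[Vector]]     # one frame: rows x cols grid of vectors


def motion_image(frames: List[Field]) -> List[List[List[float]]]:
    """Collapse a shot's motion fields over time into a rows x cols x 4 grid.

    Assumption for illustration only: the four channels are the mean
    rightward, leftward, downward, and upward motion at each grid cell.
    The spatial layout is kept; the time axis is averaged away.
    """
    if not frames:
        raise ValueError("need at least one frame")
    rows, cols = len(frames[0]), len(frames[0][0])
    image = [[[0.0] * 4 for _ in range(cols)] for _ in range(rows)]
    for field in frames:
        for r in range(rows):
            for c in range(cols):
                dx, dy = field[r][c]
                cell = image[r][c]
                cell[0] += max(dx, 0.0)    # rightward component
                cell[1] += max(-dx, 0.0)   # leftward component
                cell[2] += max(dy, 0.0)    # downward component
                cell[3] += max(-dy, 0.0)   # upward component
    n = float(len(frames))
    return [[[v / n for v in cell] for cell in row] for row in image]


def flatten(image: List[List[List[float]]]) -> List[float]:
    """Turn the motion image into a fixed-length vector for a classifier."""
    return [v for row in image for cell in row for v in cell]
```

Because the output has a fixed length regardless of how many frames the shot contains, it can be passed directly to a discriminant classifier, which is the property the abstract contrasts with time-series generative models.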

Index Terms

Classification of video events using 4-dimensional time-compressed motion features
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene understanding
  2. Machine learning
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published In

CIVR '07: Proceedings of the 6th ACM international conference on Image and video retrieval

July 2007

655 pages

ISBN:9781595937339

DOI:10.1145/1282280

General Chairs:
Nicu Sebe (Univ. of Amsterdam, The Netherlands),
Marcel Worring (Univ. of Amsterdam, The Netherlands)

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

In-Cooperation

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

CIVR07: International Conference on Image and Video Retrieval 2007

Sponsor: SIGMM

July 9 - 11, 2007

Amsterdam, The Netherlands

Bibliometrics

Total Citations: 19
Total Downloads: 392
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 03 Mar 2025

Cited By

Alsafrjalani M (2022). A Framework Model for Integrating Social Media, the Web, and Proprietary Services Into YouTube Video Classification Process. Research Anthology on Applying Social Networking Strategies to Classrooms and Libraries, 260-277. DOI: 10.4018/978-1-6684-7123-4.ch015. Online publication date: 8-Jul-2022
https://doi.org/10.4018/978-1-6684-7123-4.ch015
Alsafrjalani M (2019). A Framework Model for Integrating Social Media, the Web, and Proprietary Services Into YouTube Video Classification Process. International Journal of Multimedia Data Engineering and Management, 10:2, 21-36. DOI: 10.4018/IJMDEM.2019040102. Online publication date: 1-Apr-2019
https://doi.org/10.4018/IJMDEM.2019040102
Liu C, Dong S, Lu B, Abdel-Mottaleb M (2015). Multimedia event detection with ℓ2-regularized logistic Gaussian mixture regression. Neural Computing and Applications, 26:7, 1561-1574. DOI: 10.1007/s00521-014-1810-y. Online publication date: 1-Oct-2015
https://dl.acm.org/doi/10.1007/s00521-014-1810-y
Wang F, Sun Z, Jiang Y, Ngo C (2014). Video Event Detection Using Motion Relativity and Feature Selection. IEEE Transactions on Multimedia, 16:5, 1303-1315. DOI: 10.1109/TMM.2014.2315780. Online publication date: Aug-2014
https://doi.org/10.1109/TMM.2014.2315780
Bhatt C, Atrey P, Kankanhalli M (2013). A reward-and-punishment-based approach for concept detection using adaptive ontology rules. ACM Transactions on Multimedia Computing, Communications, and Applications, 9:2, 1-21. DOI: 10.1145/2457450.2457452. Online publication date: 10-May-2013
https://dl.acm.org/doi/10.1145/2457450.2457452
Ballan L, Bertini M, Bimbo A, Seidenari L, Serra G (2011). Event detection and recognition for semantic annotation of video. Multimedia Tools and Applications, 51:1, 279-302. DOI: 10.1007/s11042-010-0643-7. Online publication date: 1-Jan-2011
https://dl.acm.org/doi/10.1007/s11042-010-0643-7
Tahayna B, Belkhatir M, Alhashmi S, O'Daniel T, Li S, Gao X, Sebe N (2010). Motion data-driven model for semantic events classification using an optimized support vector machine. Proceedings of the ACM International Conference on Image and Video Retrieval, 296-302. DOI: 10.1145/1816041.1816085. Online publication date: 5-Jul-2010
https://dl.acm.org/doi/10.1145/1816041.1816085
Ulges A, Schulze C, Koch M, Breuel T (2010). Learning automatic concept detectors from online video. Computer Vision and Image Understanding, 114:4, 429-438. DOI: 10.1016/j.cviu.2009.08.002. Online publication date: 1-Apr-2010
https://dl.acm.org/doi/10.1016/j.cviu.2009.08.002
Ballan L, Bertini M, Del Bimbo A, Serra G (2010). Video event classification using string kernels. Multimedia Tools and Applications, 48:1, 69-87. DOI: 10.1007/s11042-009-0351-3. Online publication date: 1-May-2010
https://dl.acm.org/doi/10.1007/s11042-009-0351-3
Ballan L, Bertini M, Del Bimbo A, Serra G (2010). Semantic annotation of soccer videos by visual instance clustering and spatial/temporal reasoning in ontologies. Multimedia Tools and Applications, 48:2, 313-337. DOI: 10.1007/s11042-009-0342-4. Online publication date: 1-Jun-2010
https://dl.acm.org/doi/10.1007/s11042-009-0342-4
