DOI: 10.1145/1282280.1282311
Article

Classification of video events using 4-dimensional time-compressed motion features

Published: 09 July 2007

Abstract

Among the various types of semantic concepts modeled, events pose the greatest challenge in terms of both the computational power needed to represent the event and the accuracy that can be achieved in modeling it. We introduce a novel low-level visual feature that summarizes motion in a shot. This feature leverages motion vectors from MPEG-encoded video and aggregates local motion vectors over time in a matrix, which we refer to as a motion image. The resulting motion image is representative of the overall motion in a video shot, having compressed the temporal dimension while preserving spatial ordering. Building motion models using this feature permits us to combine the power of discriminant modeling with the dynamics of the motion in video shots, a combination that cannot be achieved by building generative models over a time series of motion features from multiple frames in the video shot. Evaluation of models built using several motion image features on the TRECVID 2005 dataset shows that this novel motion feature yields an average improvement of 140% in concept detection performance over existing motion features. Experiments also reveal that when this motion feature is combined with static feature representations of a single keyframe from the shot, such as color and texture features, the fused detection improves by 4% to 12% over fusion across the static features alone.
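The idea of a motion image can be illustrated with a minimal sketch. The abstract does not state how the local motion vectors are aggregated, so the code below assumes one simple choice: averaging the motion magnitude at each grid position over all frames of a shot. It also assumes the motion vectors have already been extracted from the MPEG stream into a per-frame grid of (dx, dy) pairs; the names `motion_image` and `flatten` are hypothetical and not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, float]        # (dx, dy) for one macroblock (assumed layout)
Field = Sequence[Sequence[Vector]]  # rows x cols grid of vectors for one frame


def motion_image(frames: Sequence[Field]) -> List[List[float]]:
    """Collapse a shot's motion fields into one matrix.

    Assumption: each cell holds the mean motion magnitude at that grid
    position, so time is compressed while spatial ordering is preserved.
    """
    if not frames:
        raise ValueError("a shot needs at least one frame of motion vectors")
    rows, cols = len(frames[0]), len(frames[0][0])
    image = [[0.0] * cols for _ in range(rows)]
    for field in frames:
        if len(field) != rows or any(len(row) != cols for row in field):
            raise ValueError("all frames must share the same grid size")
        for r in range(rows):
            for c in range(cols):
                dx, dy = field[r][c]
                image[r][c] += math.hypot(dx, dy)  # accumulate magnitude over time
    n = len(frames)
    return [[value / n for value in row] for row in image]


def flatten(image: List[List[float]]) -> List[float]:
    """Row-major feature vector, suitable as input to a discriminant classifier."""
    return [value for row in image for value in row]
```

Because the result is a fixed-size vector regardless of shot length, it can be passed to a discriminant classifier directly, which is the property the abstract contrasts with generative models over a time series.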

Information & Contributors

Information

Published In

CIVR '07: Proceedings of the 6th ACM international conference on Image and video retrieval
July 2007
655 pages
ISBN:9781595937339
DOI:10.1145/1282280
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. LSCOM
  2. MPEG motion vectors
  3. TRECVID
  4. motion features

Qualifiers

  • Article

Conference

CIVR07

Cited By

  • (2022) A Framework Model for Integrating Social Media, the Web, and Proprietary Services Into YouTube Video Classification Process. Research Anthology on Applying Social Networking Strategies to Classrooms and Libraries, pp. 260-277. DOI: 10.4018/978-1-6684-7123-4.ch015. Online publication date: 8-Jul-2022
  • (2019) A Framework Model for Integrating Social Media, the Web, and Proprietary Services Into YouTube Video Classification Process. International Journal of Multimedia Data Engineering and Management 10:2, pp. 21-36. DOI: 10.4018/IJMDEM.2019040102. Online publication date: 1-Apr-2019
  • (2015) Multimedia event detection with ℓ2-regularized logistic Gaussian mixture regression. Neural Computing and Applications 26:7, pp. 1561-1574. DOI: 10.1007/s00521-014-1810-y. Online publication date: 1-Oct-2015
  • (2014) Video Event Detection Using Motion Relativity and Feature Selection. IEEE Transactions on Multimedia 16:5, pp. 1303-1315. DOI: 10.1109/TMM.2014.2315780. Online publication date: Aug-2014
  • (2013) A reward-and-punishment-based approach for concept detection using adaptive ontology rules. ACM Transactions on Multimedia Computing, Communications, and Applications 9:2, pp. 1-21. DOI: 10.1145/2457450.2457452. Online publication date: 10-May-2013
  • (2011) Event detection and recognition for semantic annotation of video. Multimedia Tools and Applications 51:1, pp. 279-302. DOI: 10.1007/s11042-010-0643-7. Online publication date: 1-Jan-2011
  • (2010) Motion data-driven model for semantic events classification using an optimized support vector machine. Proceedings of the ACM International Conference on Image and Video Retrieval, pp. 296-302. DOI: 10.1145/1816041.1816085. Online publication date: 5-Jul-2010
  • (2010) Learning automatic concept detectors from online video. Computer Vision and Image Understanding 114:4, pp. 429-438. DOI: 10.1016/j.cviu.2009.08.002. Online publication date: 1-Apr-2010
  • (2010) Video event classification using string kernels. Multimedia Tools and Applications 48:1, pp. 69-87. DOI: 10.1007/s11042-009-0351-3. Online publication date: 1-May-2010
  • (2010) Semantic annotation of soccer videos by visual instance clustering and spatial/temporal reasoning in ontologies. Multimedia Tools and Applications 48:2, pp. 313-337. DOI: 10.1007/s11042-009-0342-4. Online publication date: 1-Jun-2010
