ABSTRACT
In this paper, we propose a novel system that automatically detects and classifies highlights in TV broadcasts of baseball games. The digest system produces a complete index of a baseball game, covering every status change in the game. We achieve this by integrating image, audio, and speech cues with a maximum entropy based method. What distinguishes our system from previous ones is its emphasis on integrating multimedia features and on acquiring domain knowledge through machine learning. Integrating multimedia features is important because, with current state-of-the-art image and audio analysis techniques, most image and audio features that can be extracted from video are very low level, so detecting and classifying sports highlights from the features of a single medium is bound to perform poorly. Acquiring domain knowledge through learning is preferred over heuristic rules because machine learning is more powerful for discovering and expressing domain knowledge. We perform extensive experiments on game videos that cover various stadiums and teams and were broadcast by different TV stations.
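The abstract does not spell out the model, so the following is only a minimal sketch of what a maximum entropy classifier over multimodal cues can look like, not the system described in the paper. It assumes three hypothetical binary cues (a crowd-cheer audio cue, a field-view image cue, and a spoken-keyword cue) plus a constant bias feature, two hypothetical labels, and plain gradient ascent on the log-likelihood as the training procedure. All names and the toy data are illustrative assumptions.

```python
import math

# Hypothetical toy data: [crowd_cheer, field_view, spoken_keyword, bias].
# A segment is labelled "highlight" when at least two cues fire (an assumption
# made only so that the toy problem is learnable).
SAMPLES = [
    [1, 1, 1, 1], [1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 1],
    [1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1], [0, 0, 0, 1],
]
LABELS = ["highlight"] * 4 + ["other"] * 4
CLASSES = ["highlight", "other"]


def predict_proba(weights, x):
    """Return p(c | x) proportional to exp(sum_j weights[c][j] * x[j])."""
    scores = {c: sum(w * xj for w, xj in zip(ws, x)) for c, ws in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}


def train_maxent(samples, labels, classes, epochs=200, lr=0.5):
    """Fit the weights by batch gradient ascent on the conditional log-likelihood."""
    n_features = len(samples[0])
    weights = {c: [0.0] * n_features for c in classes}
    for _ in range(epochs):
        grad = {c: [0.0] * n_features for c in classes}
        for x, y in zip(samples, labels):
            probs = predict_proba(weights, x)
            for c in classes:
                observed = 1.0 if c == y else 0.0
                for j, xj in enumerate(x):
                    # observed feature count minus the model's expected count
                    grad[c][j] += (observed - probs[c]) * xj
        for c in classes:
            for j in range(n_features):
                weights[c][j] += lr * grad[c][j] / len(samples)
    return weights


def classify(weights, x):
    """Pick the class with the highest model probability."""
    probs = predict_proba(weights, x)
    return max(probs, key=probs.get)


weights = train_maxent(SAMPLES, LABELS, CLASSES)
print(classify(weights, [1, 1, 0, 1]))
```

The point of the sketch is the fusion step: no single cue decides the label, and the learned weights determine how much each medium contributes. A real system would use many more features and labels, and could be trained with a different procedure.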