ABSTRACT
An increasing number of people regularly capture video at social occasions such as weddings, parties, and holiday trips. As a result, a single event is often covered by multiple video recordings that offer different viewing angles and wider coverage. This creates an opportunity to produce a video summary of the event by combining the most favorable views from the different recordings. Mixing content from different cameras requires the recordings to be synchronized very precisely in time, a task that is tedious and is presently done manually. We present two methods for synchronizing multiple videos based on the identical audio content present in the recordings. The first method uses audio classification and determines the synchronization between two recordings by correlating their audio classes. The second method represents the recorded audio with audio fingerprints and determines the synchronization from fingerprint matches between the recordings. The experimental results show that the audio-classification method requires recordings at least a couple of minutes long, with large temporal overlap, to determine the synchronization point. The audio-fingerprint method requires at least 3 seconds of overlapping audio and resulted in perfect synchronization in all the examined cases.
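The two ideas summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-frame label and fingerprint representations, the `min_overlap` parameter, and the use of exact fingerprint equality are all assumptions made for illustration. Practical fingerprint systems typically tolerate some mismatch between fingerprints instead of requiring exact equality.

```python
from collections import Counter


def offset_by_class_correlation(labels_a, labels_b, min_overlap=1):
    """Hypothetical helper: find the lag (in frames) of labels_b relative to
    labels_a that maximizes the fraction of agreeing audio-class labels."""
    best_lag, best_score = None, -1.0
    for lag in range(-(len(labels_b) - 1), len(labels_a)):
        # Frame i - lag of labels_b is compared against frame i of labels_a.
        start = max(0, lag)
        end = min(len(labels_a), lag + len(labels_b))
        overlap = end - start
        if overlap < min_overlap:
            continue  # too little overlap to trust the agreement score
        matches = sum(labels_a[i] == labels_b[i - lag] for i in range(start, end))
        score = matches / overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


def offset_by_fingerprint_votes(prints_a, prints_b):
    """Hypothetical helper: each exact fingerprint match votes for a frame
    offset; the offset with the most votes is returned with its vote count."""
    positions = {}
    for i, fp in enumerate(prints_a):
        positions.setdefault(fp, []).append(i)
    votes = Counter()
    for j, fp in enumerate(prints_b):
        for i in positions.get(fp, ()):
            votes[i - j] += 1
    if not votes:
        return None, 0
    return votes.most_common(1)[0]


# Toy data (invented): recording B starts 3 frames into recording A.
classes_a = ["s", "s", "m", "m", "n", "s", "m", "n", "n", "s", "m", "s"]
classes_b = classes_a[3:9]
class_result = offset_by_class_correlation(classes_a, classes_b, min_overlap=4)

# Toy data (invented): three fingerprints of B match A at an offset of 2 frames.
prints_a = [10, 11, 12, 13, 14, 15, 16, 17]
prints_b = [99, 13, 14, 15, 98]
print_result = offset_by_fingerprint_votes(prints_a, prints_b)
```

The `min_overlap` guard reflects the abstract's observation that the classification-based approach needs a large temporal overlap: with only a few overlapping frames, a high agreement score can occur by chance.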
Index Terms
- Synchronization of multi-camera video recordings based on audio