APPENDICES and SUPPLEMENTS
Online appendix to designing mediation for context-aware applications. The appendix supports the information on page 91.
ABSTRACT
We propose a generic and robust framework for news video indexing, founded on a broadcast news production model. Within this model we identify four production phases, each providing useful metadata for annotation. In contrast to semiautomatic indexing approaches, which exploit this information at production time, we adhere to an automatic, data-driven approach. To that end, we analyze a digital news video using a separate set of multimodal detectors for each production phase. By combining the resulting production-derived features in a statistical classifier ensemble, the framework facilitates robust classification of several rich semantic concepts in news video; here, rich means that the concepts share many similarities in their production process. Experiments on an archive of 120 hours of news video from the 2003 TRECVID benchmark show that a combined analysis of production phases yields the best results. In addition, we demonstrate that the accuracy of the proposed style analysis framework for classification of several rich semantic concepts is state-of-the-art.