
Tools for placing cuts and transitions in interview video

Published: 01 July 2012

Abstract

We present a set of tools designed to help editors place cuts and create transitions in interview video. To help place cuts, our interface links a text transcript of the video to the corresponding locations in the raw footage. It also visualizes the suitability of cut locations by analyzing audio/visual features of the raw footage to find frames where the speaker is relatively quiet and still. With these tools, editors can highlight a segment of text, check whether its endpoints are suitable cut locations, and, if so, simply delete the text to make the edit. For each cut, our system generates visible transitions (e.g., jump cut, fade) as well as seamless, hidden transitions. We present a hierarchical, graph-based algorithm for efficiently generating hidden transitions that considers visual features specific to interview footage. We also describe a new data-driven technique for setting the timing of the hidden transition. Finally, our tools offer a one-click method for seamlessly removing 'ums' and repeated words, as well as for inserting natural-looking pauses to emphasize semantic content. We apply our tools to edit a variety of interviews and also show how they can be used to quickly compose multiple takes of an actor narrating a story.
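The cut-suitability analysis described above can be approximated with a simple per-frame score: normalize an audio loudness measure and an inter-frame motion measure, then favor frames where both are low. A minimal sketch (the feature choices, weights, and function names here are illustrative assumptions, not the paper's actual implementation):

```python
def cut_suitability(audio_rms, frame_diffs, w_audio=0.5, w_motion=0.5):
    """Score each frame for cut suitability: quieter, stiller frames score
    higher. audio_rms and frame_diffs are per-frame feature lists
    (hypothetical features standing in for the paper's audio/visual analysis)."""
    def norm(xs):
        # Min-max normalize a feature to [0, 1] so the weights are comparable.
        lo, hi = min(xs), max(xs)
        rng = hi - lo
        return [(x - lo) / rng if rng > 0 else 0.0 for x in xs]

    loudness = norm(audio_rms)
    motion = norm(frame_diffs)
    # Suitability is high exactly where both loudness and motion are low.
    return [1.0 - (w_audio * l + w_motion * m) for l, m in zip(loudness, motion)]

# Frame 2 is the quietest and stillest, so it scores highest.
scores = cut_suitability([0.8, 0.5, 0.05, 0.6], [0.7, 0.4, 0.02, 0.5])
best = scores.index(max(scores))  # → 2
```

An editor-facing interface could then threshold or color-code these scores along the transcript to mark good cut points.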

Supplementary Material

  • JPG File (tp168_12.jpg)
  • ZIP File (a67-berthouzoz.zip): supplemental material
  • MP4 File (tp168_12.mp4)



Published In

ACM Transactions on Graphics, Volume 31, Issue 4
July 2012, 935 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2185520

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. human-computer interfaces
  2. interaction

Qualifiers

  • Research-article


Cited By

  • (2024) ExpressEdit: Video Editing with Natural Language and Sketching. Proceedings of the 29th International Conference on Intelligent User Interfaces, 515-536. DOI: 10.1145/3640543.3645164. Online publication date: 18-Mar-2024.
  • (2024) Improving AI-assisted video editing: Optimized footage analysis through multi-task learning. Neurocomputing, 128485. DOI: 10.1016/j.neucom.2024.128485. Online publication date: Aug-2024.
  • (2024) Evaluating the Efficacy of Automated Video Editing in Educational Content Production: A Time Efficiency and Learner Perspective Study. Learning and Collaboration Technologies, 234-246. DOI: 10.1007/978-3-031-61672-3_15. Online publication date: 29-Jun-2024.
  • (2023) Eventfulness for Interactive Video Alignment. ACM Transactions on Graphics 42(4), 1-10. DOI: 10.1145/3592118. Online publication date: 26-Jul-2023.
  • (2023) AVscript: Accessible Video Editing with Audio-Visual Scripts. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-17. DOI: 10.1145/3544548.3581494. Online publication date: 19-Apr-2023.
  • (2023) Beyond Instructions: A Taxonomy of Information Types in How-to Videos. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-21. DOI: 10.1145/3544548.3581126. Online publication date: 19-Apr-2023.
  • (2023) Match Cutting: Finding Cuts with Smooth Visual Transitions. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2114-2124. DOI: 10.1109/WACV56688.2023.00215. Online publication date: Jan-2023.
  • (2023) LEMMS: Label Estimation of Multi-feature Movie Segments. 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 3019-3027. DOI: 10.1109/ICCVW60793.2023.00325. Online publication date: 2-Oct-2023.
  • (2022) Record Once, Post Everywhere: Automatic Shortening of Audio Stories for Social Media. Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology, 1-11. DOI: 10.1145/3526113.3545680. Online publication date: 29-Oct-2022.
  • (2022) Synthesis-Assisted Video Prototyping From a Document. Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology, 1-10. DOI: 10.1145/3526113.3545676. Online publication date: 29-Oct-2022.
