
Blind video temporal consistency

Published: 02 November 2015

Abstract

Extending image processing techniques to videos is a non-trivial task; applying processing independently to each video frame often leads to temporal inconsistencies, and explicitly encoding temporal consistency requires algorithmic changes. We describe a more general approach to temporal consistency. We propose a gradient-domain technique that is blind to the particular image processing algorithm. Our technique takes a series of processed frames that suffers from flickering and generates a temporally-consistent video sequence. The core of our solution is to infer the temporal regularity from the original unprocessed video, and use it as a temporal consistency guide to stabilize the processed sequence. We formally characterize the frequency properties of our technique, and demonstrate, in practice, its ability to stabilize a wide range of popular image processing techniques including enhancement and stylization of color and tone, intrinsic images, and depth estimation.
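To make the abstract's description concrete, the following is a minimal, purely illustrative sketch (our notation, not the authors'): a per-frame, screened-Poisson-style energy that preserves the gradients of the processed frame P_t while pulling the result toward the flow-warped previous output, with a per-pixel weight w_t inferred from how well the original unprocessed frames agree after warping. The warp operator W, the weights lambda and alpha, and the exact form of w_t are assumptions made for illustration; the paper defines its own formulation and solver.

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Illustrative sketch only (notation is not taken from the paper):
%   O_t : stabilized output frame,    P_t : per-frame processed input,
%   I_t : original unprocessed frame, W   : backward warp along optical flow.
\begin{align*}
O_t &= \arg\min_{O} \int_{\Omega}
      \bigl\| \nabla O(x) - \nabla P_t(x) \bigr\|^2
      + \lambda\, w_t(x)\, \bigl\| O(x) - \mathcal{W}[O_{t-1}](x) \bigr\|^2 \,\mathrm{d}x, \\
% The screening weight is inferred from the temporal regularity of the
% original video: it is large where the unprocessed frames agree after warping.
w_t(x) &= \exp\!\Bigl( -\alpha \,\bigl\| I_t(x) - \mathcal{W}[I_{t-1}](x) \bigr\|^2 \Bigr).
\end{align*}
\end{document}

Under this reading, each output frame amounts to solving a screened Poisson problem whose screening weight comes from the original video, which is consistent with the frequency characterization the abstract mentions; again, this is a sketch rather than the authors' exact method.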

Supplementary Material

ZIP File (a196-bonneel.zip)
Supplemental files.

Published In

ACM Transactions on Graphics, Volume 34, Issue 6
November 2015
944 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2816795

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. temporal consistency
  2. video processing

Qualifiers

  • Research-article
