
Sampling based scene-space video processing

Published: 27 July 2015

Abstract

Many compelling video processing effects can be achieved if per-pixel depth information and 3D camera calibrations are known. However, the success of such methods is highly dependent on the accuracy of this "scene-space" information. We present a novel, sampling-based framework for processing video that enables high-quality scene-space video effects in the presence of inevitable errors in depth and camera pose estimation. Instead of trying to improve the explicit 3D scene representation, the key idea of our method is to exploit the high redundancy of approximate scene information that arises due to most scene points being visible multiple times across many frames of video. Based on this observation, we propose a novel pixel gathering and filtering approach. The gathering step is general and collects pixel samples in scene-space, while the filtering step is application-specific and computes a desired output video from the gathered sample sets. Our approach is easily parallelizable and has been implemented on GPU, allowing us to take full advantage of large volumes of video data and facilitating practical runtimes on HD video using a standard desktop computer. Our generic scene-space formulation is able to comprehensively describe a multitude of video processing applications such as denoising, deblurring, super resolution, object removal, computational shutter functions, and other scene-space camera effects. We present results for various casually captured, hand-held, moving, compressed, monocular videos depicting challenging scenes recorded in uncontrolled environments.
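The paper's actual gathering and filtering operators are not reproduced on this page; the following is only a minimal sketch of the two-step idea the abstract describes, assuming per-pixel depth maps, camera intrinsics K, and camera-to-world poses are given. Pixels are back-projected to world space (the redundant scene-space samples), gathered across frames near a query scene point, and combined with a distance-weighted average as a simple stand-in for an application-specific filter (here, denoising). All function names and parameters are illustrative, not the authors' API.

```python
import numpy as np

def backproject(depth, K, cam_to_world):
    """Lift every pixel of a depth map to a 3D point in world space.

    depth: (h, w) depth per pixel; K: (3, 3) intrinsics;
    cam_to_world: (4, 4) homogeneous camera pose.
    Returns (h*w, 3) world-space points in row-major pixel order.
    """
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], -1).reshape(-1, 3).astype(float)
    rays = pix @ np.linalg.inv(K).T           # camera-space ray directions
    pts = rays * depth.reshape(-1, 1)         # scale each ray by its depth
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    return (pts_h @ cam_to_world.T)[:, :3]

def gather(point, frames, radius):
    """Gathering step: pool (color, distance) samples from every frame
    whose back-projected pixel lies within `radius` of `point`.

    frames: list of (colors, world_pts) with colors (h, w) grayscale and
    world_pts (h*w, 3) as returned by backproject().
    """
    cols, dists = [], []
    for colors, world_pts in frames:
        d = np.linalg.norm(world_pts - point, axis=1)
        keep = d < radius
        cols.append(colors.reshape(-1)[keep])
        dists.append(d[keep])
    return np.concatenate(cols), np.concatenate(dists)

def denoise(point, frames, radius, sigma):
    """Filtering step (denoising flavor): Gaussian distance-weighted
    average of all gathered samples for one scene point."""
    cols, dists = gather(point, frames, radius)
    w = np.exp(-0.5 * (dists / sigma) ** 2)
    return np.sum(w * cols) / np.sum(w)
```

Because each scene point is typically seen in many frames, even this naive filter averages away per-frame noise; the application-specific filters in the paper replace the simple Gaussian weighting, while the gathering step stays generic.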

Supplementary Material

ZIP File (a67-klose.zip)
Supplemental files



Published In

ACM Transactions on Graphics  Volume 34, Issue 4
August 2015
1307 pages
ISSN:0730-0301
EISSN:1557-7368
DOI:10.1145/2809654

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. computational shutters
  2. denoising
  3. inpainting
  4. sampling
  5. video processing

Qualifiers

  • Research-article

Cited By
  • (2024) ST-SAIL: Spatio-temporal Semantic Analysis of Light Fields: Optimizing the Sampling Pattern of Light Field Arrays. 2024 IEEE International Conference on Consumer Electronics (ICCE), 1-6. DOI: 10.1109/ICCE59016.2024.10444447
  • (2024) Depth-of-Field Region Detection and Recognition From a Single Image Using Adaptively Sampled Learning Representation. IEEE Access, 12, 42248-42263. DOI: 10.1109/ACCESS.2024.3377667
  • (2023) VideoDoodles: Hand-Drawn Animations on Videos with Scene-Aware Canvases. ACM Transactions on Graphics, 42(4), 1-12. DOI: 10.1145/3592413
  • (2023) Implicit View-Time Interpolation of Stereo Videos Using Multi-Plane Disparities and Non-Uniform Coordinates. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 888-898. DOI: 10.1109/CVPR52729.2023.00092
  • (2023) Synthetic Football Sprite Animations Learned Across the Pitch. Advances in Computational Collective Intelligence, 610-618. DOI: 10.1007/978-3-031-41774-0_48
  • (2023) Few-Shots Novel Space-Time View Synthesis from Consecutive Photos. The 12th Conference on Information Technology and Its Applications, 240-249. DOI: 10.1007/978-3-031-36886-8_20
  • (2021) Contiguous Loss for Motion-Based, Non-Aligned Image Deblurring. Symmetry, 13(4), 630. DOI: 10.3390/sym13040630
  • (2021) Object-Wise Video Editing. Applied Sciences, 11(2), 671. DOI: 10.3390/app11020671
  • (2021) Recursive Neural Network for Video Deblurring. IEEE Transactions on Circuits and Systems for Video Technology, 31(8), 3025-3036. DOI: 10.1109/TCSVT.2020.3035722
  • (2021) Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 6494-6504. DOI: 10.1109/CVPR46437.2021.00643
