Laplacian Pyramid of Conditional Variational Autoencoders

Authors:
Garoe Dorta (University of Bath, Anthropics Technology Ltd.)
Sara Vicente (Anthropics Technology Ltd.)
Lourdes Agapito (University College London)
Neill D.F. Campbell (University of Bath)
Simon Prince (Anthropics Technology Ltd.)
Ivor Simpson (Anthropics Technology Ltd.)

CVMP '17: Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017), December 2017, Article No.: 7, Pages 1–9, https://doi.org/10.1145/3150165.3150172

Published: 11 December 2017

ABSTRACT

Variational Autoencoders (VAEs) learn a latent representation of image data that allows natural image generation and manipulation. However, they struggle to generate sharp images. To address this problem, we propose a hierarchy of VAEs analogous to a Laplacian pyramid. Each network models a single pyramid level and is conditioned on the coarser levels. The Laplacian architecture allows for novel image editing applications that take advantage of the coarse-to-fine structure of the model. Our method achieves lower reconstruction error than the VAE in terms of MSE, even though MSE is the loss function of the VAE and is not directly minimised in our model. Furthermore, the reconstructions generated by the proposed model are preferred over those from the VAE by human evaluators.
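The pyramid structure the abstract refers to can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 2x2 box-filter downsampling, the nearest-neighbour upsampling, the number of levels, and all function names are assumptions chosen for brevity, and the per-level conditional VAEs are omitted entirely. It only shows how an image splits into a coarse base plus per-level detail residuals, and how those recombine from coarse to fine.

```python
import numpy as np


def downsample(img):
    """Halve resolution by averaging each 2x2 block (assumes even height and width)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Double resolution by nearest-neighbour repetition (an assumed, simple choice)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def build_laplacian_pyramid(img, levels):
    """Return [detail_0, ..., detail_{levels-2}, coarsest image], finest level first."""
    pyramid = []
    current = img
    for _ in range(levels - 1):
        coarser = downsample(current)
        # Detail residual: what the coarser level cannot explain at this resolution.
        pyramid.append(current - upsample(coarser))
        current = coarser
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Rebuild the image coarse-to-fine by adding each detail residual back."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current) + detail
    return current


# Hypothetical 64x64 greyscale input; any even-sized 2-D array works.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
pyr = build_laplacian_pyramid(image, levels=3)
restored = reconstruct(pyr)
```

In a model of the kind the abstract describes, each entry of `pyr` would be handled by its own network, with the finer levels conditioned on the coarser ones; here the decomposition is kept exact so that `restored` matches `image` up to floating-point error.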

Index Terms

Laplacian Pyramid of Conditional Variational Autoencoders
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published in

CVMP '17: Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017)
December 2017
93 pages
ISBN:9781450353298
DOI:10.1145/3150165

Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Deep Neural Networks
Faces
Generative Models
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
CVMP '17 Paper Acceptance Rate: 10 of 16 submissions, 63%. Overall Acceptance Rate: 40 of 67 submissions, 60%.