Abstract
Automatic generation of artistic glyph images is a challenging task that has attracted considerable research interest. Previous methods are either specifically designed for shape synthesis or focused on texture transfer. In this paper, we propose a novel model, AGIS-Net, that transfers both shape and texture styles in one stage with only a few stylized samples. To achieve this goal, we first disentangle the representations of content and style using two encoders, enabling multi-content and multi-style generation. We then use two collaboratively working decoders to generate the glyph shape image and its texture image simultaneously. In addition, we introduce a local texture refinement loss to further improve the quality of the synthesized textures. In this manner, our one-stage model is much more efficient and effective than other multi-stage stacked methods. We also propose a large-scale dataset of Chinese glyph images in various shape and texture styles, rendered from 35 professionally designed artistic fonts with 7,326 characters and 2,460 synthetic artistic fonts with 639 characters, to validate the effectiveness and extensibility of our method. Extensive experiments on both English and Chinese artistic glyph image datasets demonstrate the superiority of our model over other state-of-the-art methods in generating high-quality stylized glyph images.
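The one-stage pipeline described above (two encoders disentangling content and style, followed by two decoders emitting the shape image and the textured image in parallel) can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: `conv_like` stands in for real convolutional stages, and all dimensions, layer counts, and the 5-sample style set are assumptions chosen only to show the data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_like(x, out_ch):
    # Stand-in for a convolutional encoder/decoder stage: a random
    # linear map over the channel dimension (illustrative only).
    w = rng.standard_normal((x.shape[-1], out_ch)) * 0.01
    return np.tanh(x @ w)

def agisnet_forward(content_img, style_imgs):
    """Hypothetical sketch of a one-stage forward pass in the spirit
    of AGIS-Net: a content encoder and a style encoder produce
    disentangled codes, which two collaborating decoders turn into
    the glyph shape image and the textured glyph image at once."""
    z_content = conv_like(content_img, 64)             # content code
    z_style = conv_like(style_imgs.mean(axis=0), 64)   # few-shot style code
    z = np.concatenate([z_content, z_style], axis=-1)  # joint latent
    shape_out = conv_like(z, 1)    # grayscale glyph-shape image
    texture_out = conv_like(z, 3)  # RGB textured glyph image
    return shape_out, texture_out

content = rng.standard_normal((64, 64, 1))    # one content glyph
styles = rng.standard_normal((5, 64, 64, 3))  # a few stylized samples
shape, texture = agisnet_forward(content, styles)
```

Because both outputs are decoded from the same joint latent, the texture decoder can stay consistent with the predicted shape, which is the motivation for generating them simultaneously rather than in stacked stages.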