EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks | CVPR 2022

Eric R. Chan*, Connor Z. Lin*, Matthew A. Chan*, Koki Nagano*, Boxiao Pan, Shalini De Mello, Orazio Gallo, Leonidas Guibas, Jonathan Tremblay, Sameh Khamis, Tero Karras, Gordon Wetzstein

A 3D GAN for photorealistic, multi-view-consistent, and shape-aware image synthesis.

4 min Tech Talk

ABSTRACT

Unsupervised generation of high-quality multi-view-consistent images and 3D shapes using only collections of single-view 2D photographs has been a long-standing challenge. Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent; the former limits quality and resolution of the generated images and the latter adversely affects multi-view consistency and shape quality. In this work, we improve the computational efficiency and image quality of 3D GANs without overly relying on these approximations. For this purpose, we introduce an expressive hybrid explicit-implicit network architecture that, together with other design choices, synthesizes not only high-resolution multi-view-consistent images in real time but also produces high-quality 3D geometry. By decoupling feature generation and neural rendering, our framework is able to leverage state-of-the-art 2D CNN generators, such as StyleGAN2, and inherit their efficiency and expressiveness. We demonstrate state-of-the-art 3D-aware synthesis with FFHQ and AFHQ Cats, among other experiments.

CITATION

E.R. Chan*, C.Z. Lin*, M.A. Chan*, K. Nagano*, B. Pan, S. De Mello, O. Gallo, L. Guibas, J. Tremblay, S. Khamis, T. Karras, G. Wetzstein, Efficient Geometry-aware 3D Generative Adversarial Networks, CVPR 2022

@inproceedings{Chan2022,
author = {Eric R. Chan and Connor Z. Lin and Matthew A. Chan and Koki Nagano and Boxiao Pan and Shalini De Mello and Orazio Gallo and Leonidas Guibas and Jonathan Tremblay and Sameh Khamis and Tero Karras and Gordon Wetzstein},
title = {{Efficient Geometry-aware 3D Generative Adversarial Networks}},
booktitle = {CVPR},
year = {2022}
}

Tri-plane Representation

Training a GAN with neural rendering is expensive, so we use a hybrid explicit-implicit 3D representation to make neural rendering as efficient as possible. Our representation combines an explicit backbone, which produces features aligned on three orthogonal planes, with a small implicit decoder. Compared to a typical multilayer-perceptron representation, our 3D representation is more than seven times faster and uses less than one sixteenth as much memory. By using StyleGAN2 as the backbone of our representation, we also inherit its qualities, including a well-behaved latent space.
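To make the lookup concrete, the following Python sketch shows one way a tri-plane query can be implemented. The tensor shapes, names, and decoder interface are illustrative assumptions rather than our released code: a 3D point is projected onto the three axis-aligned planes, a feature vector is bilinearly sampled from each plane, the three vectors are summed, and a small MLP decodes the result into the color and density used for volume rendering.

import torch
import torch.nn.functional as F

def query_triplane(planes, points, decoder):
    # planes:  (3, C, H, W) feature maps for the XY, XZ, and YZ planes,
    #          e.g. produced by a StyleGAN2-style backbone (assumed shapes).
    # points:  (N, 3) query coordinates, assumed normalized to [-1, 1]^3.
    # decoder: a small MLP mapping a C-dim feature to color and density.

    # Project each 3D point onto the three orthogonal planes.
    xy = points[:, [0, 1]]
    xz = points[:, [0, 2]]
    yz = points[:, [1, 2]]
    coords = torch.stack([xy, xz, yz], dim=0).unsqueeze(1)   # (3, 1, N, 2)

    # Bilinearly sample one feature vector per plane and aggregate by summing.
    feats = F.grid_sample(planes, coords, mode='bilinear',
                          align_corners=False)               # (3, C, 1, N)
    feats = feats.squeeze(2).sum(dim=0).permute(1, 0)        # (N, C)

    # The lightweight implicit decoder turns the aggregated feature into
    # the color and density used for volume rendering.
    return decoder(feats)

Because each query reduces to three bilinear lookups plus a tiny MLP, most of the capacity lives in the 2D backbone rather than in a large coordinate network, which is what makes the representation fast and memory-efficient.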

Qualitative Results

The following videos demonstrate scene synthesis with our method, which produces both high-quality, multi-view-consistent renderings and detailed geometry.


Color video renderings of scenes produced by our method, created by moving the camera along a path while fixing the latent code that controls the scene.

Renderings of surfaces generated by our method, obtained by extracting isosurfaces from the density field of our 3D representation.
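For readers curious how such surfaces can be obtained, the sketch below shows a generic isosurface-extraction pipeline using marching cubes. The function query_density, the grid bounds, and the density threshold are placeholders rather than the exact settings of our pipeline.

import numpy as np
from skimage import measure

def extract_mesh(query_density, resolution=256, threshold=10.0):
    # Sample the density field on a regular grid over the (assumed) scene bounds.
    grid = np.linspace(-1.0, 1.0, resolution)
    xs, ys, zs = np.meshgrid(grid, grid, grid, indexing='ij')
    points = np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)
    density = query_density(points).reshape(resolution, resolution, resolution)

    # Marching cubes recovers the isosurface at the chosen density level.
    verts, faces, normals, _ = measure.marching_cubes(density, level=threshold)
    return verts, faces, normals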

Additional results for generated faces

Interpolation

Our method inherits the qualities of the StyleGAN2 backbone, including a well-behaved latent space. The following video shows interpolation between selected points in FFHQ.


Interpolations between latent vectors with a model trained on FFHQ.
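The sketch below illustrates the kind of latent interpolation shown in the video. The generator interface (G.mapping / G.synthesis with camera conditioning) follows StyleGAN2 conventions but is an assumption here, not our released API.

import torch

def interpolate(G, z0, z1, camera_params, num_steps=60):
    # Map both latent codes into the intermediate latent space W, where
    # linear interpolation tends to be well behaved.
    w0 = G.mapping(z0, camera_params)
    w1 = G.mapping(z1, camera_params)

    frames = []
    for t in torch.linspace(0.0, 1.0, num_steps):
        w = (1.0 - t) * w0 + t * w1        # linear blend in W space
        frames.append(G.synthesis(w, camera_params))
    return frames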

GAN Inversion

We apply the prior over 3D faces learned by our method to single-image 3D reconstruction. We use Pivotal Tuning Inversion to invert test images and recover 3D shapes and novel views.


Single image 3D reconstruction using Pivotal Tuning Inversion. Input image (left) and reconstruction (right).
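The following heavily simplified sketch outlines the two stages of Pivotal Tuning Inversion conceptually: first optimize a latent "pivot" with the generator frozen, then lightly fine-tune the generator around that pivot. The loss, optimizers, learning rates, and generator interface are placeholders, not the settings used in the paper.

import copy
import torch

def pti_invert(G, target_image, camera_params, w_steps=500, tune_steps=350):
    # Stage 1: optimize a latent pivot w while the generator stays frozen.
    z = torch.randn(1, G.z_dim)  # G.z_dim / G.mapping / G.synthesis are assumed names
    w = G.mapping(z, camera_params).detach().requires_grad_(True)
    opt_w = torch.optim.Adam([w], lr=0.01)
    for _ in range(w_steps):
        loss = (G.synthesis(w, camera_params) - target_image).abs().mean()
        opt_w.zero_grad()
        loss.backward()
        opt_w.step()

    # Stage 2: lightly fine-tune the generator weights around the pivot so the
    # reconstruction matches the input while the learned 3D prior is preserved.
    G_tuned = copy.deepcopy(G)
    opt_g = torch.optim.Adam(G_tuned.parameters(), lr=3e-4)
    w = w.detach()
    for _ in range(tune_steps):
        loss = (G_tuned.synthesis(w, camera_params) - target_image).abs().mean()
        opt_g.zero_grad()
        loss.backward()
        opt_g.step()
    return G_tuned, w

Once the generator has been tuned to the pivot, novel views of the reconstructed face are obtained simply by rendering G_tuned with new camera parameters.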

Realtime Demonstration

An efficient architecture enables scene synthesis and rendering at real-time framerates, opening the door to many exciting interactive applications.


A demonstration of our method synthesizing and rendering scenes in real time.

Acknowledgements

We thank David Luebke, Jan Kautz, Jaewoo Seo, Jonathan Granskog, Simon Yuen, Alex Evans, Stan Birchfield, Alexander Bergman, and Joy Hsu for reviewing early drafts and for their helpful suggestions and feedback. We thank Alex Chan, Giap Nguyen, and Trevor Chan for help with figures and diagrams. Koki Nagano and Eric Chan were partially supported by DARPA’s Semantic Forensics (SemaFor) contract (HR0011-20-3-0005). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Government. Distribution Statement “A” (Approved for Public Release, Distribution Unlimited). This website is based on the StyleGAN3 website template.

Related Projects

You may also be interested in related projects on 2D and 3D GANs, such as:

  • Karras et al. StyleGAN, 2019 (link)
  • Karras et al. StyleGAN2, 2020 (link)
  • Karras et al. StyleGAN3, 2021 (link)
  • Chan et al. pi-GAN, CVPR 2021 (link)

or related projects focusing on neural scene representations and rendering from our group:

  • Lindell et al. BACON. 2021 (link)
  • Martel et al. ACORN. SIGGRAPH 2021 (link)
  • Kellnhofer et al. Neural Lumigraph Rendering. CVPR 2021 (link)
  • Lindell et al. AutoInt. CVPR 2021 (link)
  • Sitzmann et al. Implicit Neural Representations with Periodic Activation Functions. NeurIPS 2020 (link)
  • Sitzmann et al. MetaSDF. NeurIPS 2020 (link)
  • Sitzmann et al. Scene Representation Networks. NeurIPS 2019 (link)
  • Sitzmann et al. Deep Voxels. CVPR 2019 (link)