Abstract
Despite the recent advancement of Generative Adversarial Networks (GANs) in learning 3D-aware image synthesis from 2D data, existing methods fail to model indoor scenes due to the large diversity of room layouts and the objects inside. We argue that indoor scenes do not have a shared intrinsic structure, and hence only using 2D images cannot adequately guide the model with the 3D geometry. In this work, we fill in this gap by introducing depth as a 3D prior (Depth is essentially a 2.5D prior, but in this paper we use 3D for simplicity). Compared with other 3D data formats, depth better fits the convolution-based generation mechanism and is more easily accessible in practice. Specifically, we propose a dual-path generator, where one path is responsible for depth generation, whose intermediate features are injected into the other path as the condition for appearance rendering. Such a design eases the 3D-aware synthesis with explicit geometry information. Meanwhile, we introduce a switchable discriminator both to differentiate real v.s. fake domains and to predict the depth from a given input. In this way, the discriminator can take the spatial arrangement into account and advise the generator to learn an appropriate depth condition. Extensive experimental results suggest that our approach is capable of synthesizing indoor scenes with impressively good quality and 3D consistency, significantly outperforming state-of-the-art alternatives. (Project page can be found here.)
| Original language | English |
|---|---|
| Title of host publication | Computer Vision – ECCV 2022 - 17th European Conference, Proceedings |
| Editors | Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 406-422 |
| Number of pages | 17 |
| ISBN (Print) | 9783031197864 |
| DOIs | |
| Publication status | Published - 2022 |
| Event | 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel Duration: 23 Oct 2022 → 27 Oct 2022 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 13676 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 17th European Conference on Computer Vision, ECCV 2022 |
|---|---|
| Country/Territory | Israel |
| City | Tel Aviv |
| Period | 23/10/22 → 27/10/22 |
Bibliographical note
Publisher Copyright:© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Keywords
- 3D-aware image synthesis
- Depth priors
- Scene synthesis