Abstract
A future video is the 2D projection of a 3D scene with predicted camera and object motion. Accurate future video prediction inherently requires an understanding of the 3D motion and geometry of the scene. In this paper, we propose an RGBD scene forecasting model with 3D motion decomposition. We predict ego-motion and foreground motion, which are combined to generate a future 3D dynamic scene; this scene is then projected onto the 2D image plane to synthesize future motion, RGB images, and depth maps. Semantic maps can optionally be integrated. Experimental results on the KITTI and Driving datasets show that our model outperforms other state-of-the-art methods in forecasting future RGBD dynamic scenes.
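The geometric step described in the abstract (combine ego-motion and foreground motion in 3D, then project to 2D) can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function name `forecast_geometry`, the pinhole intrinsics `K`, the convention that `(R, t)` maps points from the current camera frame to the next one, and the choice to apply foreground motion before ego-motion are all assumptions made here for illustration. The returned depth is indexed by source pixel; resampling onto the future image grid and RGB synthesis are omitted.

```python
import numpy as np

def forecast_geometry(depth, K, R, t, fg_motion):
    """Move a depth map one step ahead and return 2D flow plus per-point future depth.

    depth:     (h, w) depth of the current frame.
    K:         (3, 3) pinhole intrinsics (assumed known).
    R, t:      ego-motion, assumed to map current-frame points to the next frame.
    fg_motion: (h, w, 3) per-pixel 3D foreground displacement (zero for background).
    """
    h, w = depth.shape
    # Homogeneous pixel grid, shape (3, h*w), in row-major (v, u) order.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)]).reshape(3, -1).astype(np.float64)

    # Back-project every pixel to a 3D point in the current camera frame.
    pts = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)

    # Apply foreground (object) motion, then the rigid ego-motion.
    pts = pts + fg_motion.reshape(-1, 3).T
    pts_next = R @ pts + t.reshape(3, 1)

    # Project the moved points onto the 2D image plane.
    proj = K @ pts_next
    z = proj[2]
    uv_next = proj[:2] / z

    flow = (uv_next - pix[:2]).T.reshape(h, w, 2)  # future 2D motion
    depth_next = z.reshape(h, w)                   # depth of each moved point
    return flow, depth_next
```

With identity `R`, zero `t`, and zero `fg_motion`, the sketch returns zero flow and the input depth unchanged, which is a quick sanity check on the conventions chosen above.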
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 |
| Publisher | IEEE Computer Society |
| Pages | 7665-7674 |
| Number of pages | 10 |
| ISBN (Electronic) | 9781728132938 |
| Publication status | Published - Jun 2019 |
| Event | 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States; Duration: 16 Jun 2019 → 20 Jun 2019 |
Publication series
| Name | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
|---|---|
| Volume | 2019-June |
| ISSN (Print) | 1063-6919 |
Conference
| Conference | 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 |
|---|---|
| Country/Territory | United States |
| City | Long Beach |
| Period | 16/06/19 → 20/06/19 |
Bibliographical note
Publisher Copyright: © 2019 IEEE.
Keywords
- Image and Video Synthesis
- RGBD sensors and analytics