
Shadow removal and object proposal generation for RGB-D images

  • Yao XIAO

Student thesis: Doctoral thesis

Abstract

In recent years, the emergence of reliable and low-cost RGB-D sensors (e.g., Microsoft Kinect) has expanded the dimension of a single image from 2D to 3D. With the aid of the additional depth channel, 3D spatial information becomes available to complement the 2D image plane, which benefits many computer vision tasks. In this thesis, we present research on extending two vision tasks, shadow removal and object proposal generation, from a single RGB image to a single RGB-D image. Shadow removal is a classical and challenging computer vision problem. We first propose an automatic method to remove shadows from single RGB-D images. Using normal cues derived directly from depth, we can remove both hard and soft shadows while preserving surface texture and shading. Our key assumption is that pixels with similar normals, spatial locations and chromaticity should have similar colors. A modified nonlocal matching is used to compute a shadow confidence map that localizes hard shadow boundaries well, thus handling hard and soft shadows within the same framework. The detected shadows are then removed by a constrained linear optimization that reconstructs a shadow-free image. We compare our results with those produced by state-of-the-art shadow removal methods for single RGB images and by intrinsic image decomposition methods, on standard RGB-D datasets. Our second task is to generate object proposals from RGB-D images. Before that, we present a novel method to produce proposals for 2D images. Object proposals are the potential object candidates in the detection pipeline, and the distance metric plays a key role in grouping superpixels to produce them. We observe that existing distance metrics work primarily for low-complexity cases. In this thesis, we develop a novel distance metric for grouping two superpixels in high-complexity scenarios. Combining the two, we obtain a complexity-adaptive distance measure that achieves improved grouping at different levels of complexity.
Our extensive experiments show that our method achieves good results on the PASCAL VOC 2012 dataset, surpassing the latest state-of-the-art methods. Next, we focus on the task of extracting 3D region proposals from indoor RGB-D images, which aims to produce bounding boxes of candidate objects. A 3D voxel grid contains a large amount of redundant space. To rule out less informative voxels and to simplify the problem, we introduce a space compression procedure that squashes the 3D space into a 2D "tile" grid. After each tile is layered individually in the vertical direction, we propose Structural Constrained Parametric Min-Cuts (S-CPMC) to group the tiled space. The generated tile hypotheses are then further processed to reconstruct 3D bounding boxes through geodesic distance transformation (GDT). Finally, the object hypotheses are ranked by a trained ranker. Experiments show that our algorithm achieves results comparable to the state of the art on the SUN-RGBD dataset.
Date of Award: 2017
Original language: English
Awarding Institution
  • The Hong Kong University of Science and Technology
