Integration of top-down and bottom-up visual processing using a recurrent convolutional–deconvolutional neural network for semantic segmentation

Byung Wan Kim, Youngbin Park, Il Hong Suh

Research output: Contribution to journal › Article

Abstract

Semantic segmentation has a wide array of applications, including scene understanding, autonomous driving, and robot manipulation. While existing segmentation models achieve good performance using purely bottom-up deep neural processing, this paper describes a novel deep learning architecture that integrates top-down and bottom-up processing. The resulting model achieves higher accuracy at a relatively low computational cost. In the proposed model, higher-level top-down information is transmitted to the lower layers through recurrent connections in an encoder and a decoder, and the recurrent connection weights are trained using backpropagation. Experiments demonstrate that this use of top-down information improves the mean intersection over union by more than 3% on the CamVid, SUN-RGBD, and PASCAL VOC 2012 benchmark datasets compared with a state-of-the-art bottom-up-only network. Additionally, the proposed model is successfully applied to a dataset designed for robotic grasping tasks.
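The abstract's core idea of feeding higher-level decoder information back into lower encoder layers through trainable recurrent connections can be illustrated with a minimal sketch. This is not the authors' architecture; it assumes PyTorch and uses invented layer sizes (`hid`, `steps`, `n_classes`) purely for illustration: a decoder feature map is concatenated with the input image on the next recurrent step, the network is unrolled for a fixed number of steps, and all weights (including the feedback path) are trained by ordinary backpropagation.

```python
# Minimal sketch (NOT the paper's model) of top-down feedback in a
# convolutional encoder-decoder, assuming PyTorch is available.
import torch
import torch.nn as nn


class RecurrentSegNet(nn.Module):
    def __init__(self, in_ch=3, hid=16, n_classes=5, steps=3):
        super().__init__()
        self.steps = steps  # number of unrolled recurrent iterations
        # Bottom-up encoder: image + top-down feedback -> downsampled features
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch + hid, hid, 3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Top-down decoder: features -> full-resolution feedback map
        self.dec = nn.ConvTranspose2d(hid, hid, 4, stride=2, padding=1)
        # Per-pixel classifier applied to the final feedback map
        self.head = nn.Conv2d(hid, n_classes, 1)

    def forward(self, x):
        b, _, h, w = x.shape
        fb = x.new_zeros(b, self.dec.out_channels, h, w)  # initial feedback
        for _ in range(self.steps):  # unrolled recurrence
            z = self.enc(torch.cat([x, fb], dim=1))  # bottom-up pass
            fb = torch.relu(self.dec(z))             # top-down pass
        return self.head(fb)                         # class logits


net = RecurrentSegNet()
logits = net(torch.randn(2, 3, 32, 32))
print(logits.shape)  # per-pixel class logits at input resolution
```

Because the recurrence is unrolled for a fixed number of steps, standard backpropagation through the shared encoder/decoder weights trains both the bottom-up and the top-down (feedback) connections, which is the mechanism the abstract describes.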

Original language: English
Pages (from-to): 87-97
Number of pages: 11
Journal: Intelligent Service Robotics
Volume: 13
Issue number: 1
DOIs
Publication status: Published - 2020 Jan 1

Keywords

  • Deep recurrent neural network
  • Semantic segmentation
  • Top-down and bottom-up