Real-time Progressive 3D Semantic Segmentation for Indoor Scenes

Quang-Hieu Pham1     Binh-Son Hua2     Duc Thanh Nguyen3     Sai-Kit Yeung4    

1Singapore University of Technology and Design     2The University of Tokyo    
3Deakin University     4Hong Kong University of Science and Technology

Winter Conference on Applications of Computer Vision (WACV), 2019.

Overview of our progressive indoor scene segmentation method. From continuous frames of an RGB-D sensor, our system performs on-the-fly reconstruction and semantic segmentation. All processing is performed online, frame by frame, which makes the system suitable for real-time applications.


The widespread adoption of autonomous systems such as drones and assistant robots has created a need for real-time high-quality semantic scene segmentation. In this paper, we propose an efficient yet robust technique for on-the-fly dense reconstruction and semantic segmentation of 3D indoor scenes. To guarantee (near) real-time performance, our method is built atop an efficient super-voxel clustering method and a conditional random field with higher-order constraints from structural and object cues, enabling progressive dense semantic segmentation without any precomputation. We extensively evaluate our method on different indoor scenes including kitchens, offices, and bedrooms in the SceneNN and ScanNet datasets and show that our technique consistently produces state-of-the-art segmentation results in both qualitative and quantitative experiments.

Demo video

Comparison between our method and other systems. We compare our method with other state-of-the-art real-time semantic reconstruction systems, namely SemanticFusion and SemanticPaint, on the SceneNN and ScanNet datasets. The results show that our method outperforms the others while still running at 10–15 Hz.



  title = {Real-time progressive 3{D} semantic segmentation for indoor scenes},
  author = {Pham, Quang-Hieu and Hua, Binh-Son and Nguyen, Duc Thanh and Yeung, Sai-Kit},
  booktitle = {Winter Conference on Applications of Computer Vision (WACV)},
  year = 2019


This research project is partially supported by an internal grant from HKUST (R9429).