LCD: Learned Cross-domain Descriptors for 2D-3D Matching

Quang-Hieu Pham1   Mikaela Angelina Uy2   Binh-Son Hua3   Duc Thanh Nguyen4  
Gemma Roig5   Sai-Kit Yeung6  

1Singapore University of Technology and Design   2Stanford University   3The University of Tokyo  
4Deakin University   5Goethe University Frankfurt   6Hong Kong University of Science and Technology

AAAI Conference on Artificial Intelligence (AAAI), 2020. Oral.



In this work, we present a novel method to learn a local cross-domain descriptor for 2D image and 3D point cloud matching. Our proposed method is a dual auto-encoder neural network that maps 2D and 3D inputs into a shared latent space representation. We show that such local cross-domain descriptors in the shared embedding are more discriminative than those obtained by training in the 2D and 3D domains individually. To facilitate the training process, we built a new dataset by collecting ≈1.4 million 2D-3D correspondences with various lighting conditions and settings from publicly available RGB-D scenes. Our descriptor is evaluated in three main experiments: 2D-3D matching, cross-domain retrieval, and sparse-to-dense depth estimation. Experimental results confirm the robustness of our approach and its competitive performance, not only on cross-domain tasks but also in generalizing to single-domain 2D and 3D tasks.
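The core idea above is that two domain-specific encoders are trained so that a 2D image patch and its corresponding 3D point-cloud fragment land close together in one shared latent space, turning 2D-3D matching into nearest-neighbor search over descriptors. A minimal sketch of that matching setup is below; the encoder weights, input sizes, and latent dimension here are all hypothetical stand-ins (random linear projections rather than the paper's trained networks):

```python
import numpy as np

# Hedged sketch of the dual-encoder matching idea, NOT the paper's actual
# network: a 2D patch encoder and a 3D point-cloud encoder map inputs into
# the same d-dimensional latent space, where cross-domain matching reduces
# to comparing descriptor distances.

rng = np.random.default_rng(0)
D_LATENT = 256  # hypothetical shared-descriptor dimension

def encode_2d(patch, W):
    """Toy 2D encoder: flatten a patch and project it (stand-in for a CNN)."""
    z = patch.reshape(-1) @ W
    return z / np.linalg.norm(z)  # unit-normalize, as is common for descriptors

def encode_3d(points, W):
    """Toy 3D encoder: max-pool per-point projections (PointNet-style stand-in)."""
    z = np.max(points @ W, axis=0)
    return z / np.linalg.norm(z)

# Hypothetical input sizes: a 64x64 RGB patch and a 1024-point local
# point cloud carrying xyz + rgb per point.
W2d = rng.standard_normal((64 * 64 * 3, D_LATENT))
W3d = rng.standard_normal((6, D_LATENT))

patch = rng.random((64, 64, 3))
cloud = rng.random((1024, 6))

z2d = encode_2d(patch, W2d)
z3d = encode_3d(cloud, W3d)

# Cross-domain similarity: Euclidean distance in the shared latent space.
# With trained encoders, corresponding 2D-3D pairs would score low here.
dist = np.linalg.norm(z2d - z3d)
print(z2d.shape, z3d.shape, float(dist))
```

In the actual method each encoder is one half of an auto-encoder, so the shared latent code must also reconstruct its own domain, which regularizes the embedding beyond the simple projection shown here.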


@inproceedings{pham2020lcd,
  title = {{LCD}: {L}earned cross-domain descriptors for 2{D}-3{D} matching},
  author = {Pham, Quang-Hieu and Uy, Mikaela Angelina and Hua, Binh-Son and Nguyen, Duc Thanh and Roig, Gemma and Yeung, Sai-Kit},
  booktitle = {AAAI Conference on Artificial Intelligence (AAAI)},
  year = {2020}
}


This research project is partially supported by an internal grant from HKUST (R9429).