Semantic Segmentation-Guided 3D Shape Reconstruction of Indoor Scenes Using a PointNet-Based Autoencoder
Authors: Takahiro Miki, Yusuke Osawa, Keiichi Watanuki
Abstract: This study aims to automatically construct virtual spaces that faithfully reflect the geometry and object arrangement of real-world environments. As a first step, we propose a method for three-dimensional (3D) shape reconstruction of indoor scenes using a PointNet-based autoencoder guided by semantic information. The proposed method first segments a 3D point cloud into semantic classes and then applies a separately trained autoencoder to each class. To validate its effectiveness, we used the ScanNet++ indoor scene dataset and our own real-world data captured with a 3D scanner, performing qualitative visual comparisons and quantitative evaluations based on Chamfer distance (CD) and Earth mover's distance (EMD). The results demonstrate that the proposed method achieves high visual fidelity and a low CD error (4.23 × 10⁻⁴) on validation data similar to the training set. Although point scattering was observed on unseen test data, reconstruction fidelity still showed a clear improvement over prior work. Furthermore, we analyzed the counterintuitive observation that EMD showed the opposite trend to CD and found that this was a statistical effect arising from the differing number of instances used for evaluation. We also identified a potential application of the method: by restricting the target classes, furniture can be intentionally excluded so that only the skeletal structure of the space is reconstructed. Future work will explore enhancing the local feature representation by adding surface normals as input features and improving robustness through post-segmentation noise removal.
Keywords: 3D point clouds, semantic segmentation, PointNet, autoencoder, virtual space
DOI: 10.54941/ahfe1006913
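The abstract describes a two-stage pipeline (segment the point cloud by semantic class, then reconstruct each class with its own autoencoder) and evaluates with Chamfer distance. The paper itself publishes no code, so the following is a minimal NumPy sketch of that pipeline and metric; the names `chamfer_distance` and `reconstruct_scene`, the `autoencoders` mapping, the identity stand-in models, and the toy labels are hypothetical illustrations, not the authors' implementation. The symmetric, squared-distance form of CD shown here is one common convention; the paper may normalize differently.

```python
import numpy as np

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets p (N,3) and q (M,3):
    mean squared nearest-neighbor distance, averaged in both directions."""
    d2 = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)  # (N, M) pairwise
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

def reconstruct_scene(points: np.ndarray, labels: np.ndarray,
                      autoencoders: dict) -> np.ndarray:
    """Route points of each semantic class to the autoencoder trained for
    that class, then merge the per-class reconstructions."""
    parts = [model(points[labels == cls])
             for cls, model in autoencoders.items()
             if np.any(labels == cls)]
    return np.concatenate(parts, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((200, 3))
    lbl = rng.integers(0, 2, size=200)          # toy labels: 0 = wall, 1 = chair
    models = {0: lambda x: x, 1: lambda x: x}   # identity stand-ins for trained AEs
    recon = reconstruct_scene(pts, lbl, models)
    print("CD:", chamfer_distance(pts, recon))
```

A side effect of this per-class routing is the application the abstract mentions: omitting a class (e.g., furniture) from the `autoencoders` mapping drops it from the output, leaving only the skeletal structure of the space.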