Enhancing Monocular 3D Scene Completion with Diffusion Model

PubDate: Mar 2025

Teams: Australian National University

Writers: Changlin Song, Jiaqi Wang, Liyun Zhu, He Weng

PDF: Enhancing Monocular 3D Scene Completion with Diffusion Model

Abstract

3D scene reconstruction is essential for applications in virtual reality, robotics, and autonomous driving, enabling machines to understand and interact with complex environments. Traditional 3D Gaussian Splatting techniques rely on images captured from multiple viewpoints to achieve optimal performance, but this dependence limits their use in scenarios where only a single image is available. In this work, we introduce FlashDreamer, a novel approach for reconstructing a complete 3D scene from a single image, significantly reducing the need for multi-view inputs. Our approach leverages a pre-trained vision-language model to generate descriptive prompts for the scene, guiding a diffusion model to produce images from various perspectives, which are then fused to form a cohesive 3D reconstruction. Extensive experiments show that our method effectively and robustly expands single-image inputs into a comprehensive 3D scene, extending monocular 3D reconstruction capabilities without further training. Our code is available at this https URL.
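The abstract describes a three-stage pipeline: a vision-language model captions the input image, a diffusion model synthesizes novel views conditioned on that caption, and the views are fused into a 3D scene. A minimal sketch of that control flow is below; the function names (`describe_scene`, `generate_view`, `fuse_views`) and the angle set are hypothetical placeholders, not the authors' actual API, and the heavy model calls are stubbed out.

```python
# Hypothetical sketch of the FlashDreamer pipeline from the abstract.
# All function names and parameters are illustrative assumptions;
# model inference is stubbed with simple records.

def describe_scene(image):
    # Stage 1: a pre-trained vision-language model would produce
    # a descriptive text prompt for the scene (stubbed here).
    return "a cozy living room with a sofa and a window"

def generate_view(image, prompt, angle):
    # Stage 2: a diffusion model would synthesize the scene from a
    # new viewpoint, conditioned on the prompt (stubbed as a record).
    return {"angle": angle, "prompt": prompt, "source": image}

def fuse_views(views):
    # Stage 3: novel views would be fused into a single 3D
    # representation, e.g. via 3D Gaussian Splatting (stubbed).
    return {"num_views": len(views)}

def flash_dreamer(image, angles=(0, 45, 90, 135, 180)):
    prompt = describe_scene(image)                             # VLM prompt
    views = [generate_view(image, prompt, a) for a in angles]  # diffusion views
    return fuse_views(views)                                   # 3D fusion

scene = flash_dreamer("input.jpg")
print(scene["num_views"])  # 5
```

Note that, as the abstract states, the pipeline composes pre-trained models, so no additional training is required to extend a single image into a full scene.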
