An End-to-End Transformer Model for 3D Object Detection

Editor: 映维 | Category: CV / XR | October 26, 2021

Note: We are not able to review papers.

PubDate: Sep 2021

Teams: Facebook AI Research

Writers: Ishan Misra, Rohit Girdhar, Armand Joulin

PDF: An End-to-End Transformer Model for 3D Object Detection

Abstract

We propose 3DETR, an end-to-end Transformer based object detection model for 3D point clouds. Compared to existing detection methods that employ a number of 3D-specific inductive biases, 3DETR requires minimal modifications to the vanilla Transformer block. Specifically, we find that a standard Transformer with non-parametric queries and Fourier positional embeddings is competitive with specialized architectures that employ libraries of 3D-specific operators with hand-tuned hyperparameters. Nevertheless, 3DETR is conceptually simple and easy to implement, enabling further improvements by incorporating 3D domain knowledge. Through extensive experiments, we show 3DETR outperforms the well-established and highly optimized VoteNet baselines on the challenging ScanNetV2 dataset by 9.5%. Furthermore, we show 3DETR is applicable to 3D tasks beyond detection, and can serve as a building block for future research.
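The abstract mentions Fourier positional embeddings without spelling them out. As a rough illustration only, the sketch below uses one common formulation (random Fourier features: project the XYZ coordinate with a fixed Gaussian matrix, then take sines and cosines). The function names, the number of frequencies, the scale, and the seed are all assumptions made for this example and are not taken from the paper or its code.

```python
import math
import random


def make_projection(in_dim=3, num_freqs=4, scale=1.0, seed=0):
    """Sample a fixed (num_freqs x in_dim) Gaussian projection matrix B.

    All defaults here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)]
            for _ in range(num_freqs)]


def fourier_embed(point, projection):
    """Map an XYZ point p to [sin(2*pi*B p), cos(2*pi*B p)]."""
    angles = [
        2.0 * math.pi * sum(b * x for b, x in zip(row, point))
        for row in projection
    ]
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]


B = make_projection()
emb = fourier_embed((0.5, -1.0, 2.0), B)
print(len(emb))  # 8: four sine features followed by four cosine features
```

The embedding has twice as many entries as there are sampled frequencies, and the same matrix B must be reused for every point so that the embeddings are comparable.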

Link to this article: https://paper.nweon.com/11346
