OCTOPUS: Open-vocabulary Content Tracking and Object Placement Using Semantic Understanding in Mixed Reality
PubDate: Dec 2023
Teams: University of Chinese Academy of Sciences; Chinese Academy of Sciences; Alibaba Group; Nanjing University
Writers: Luke Yoffe, Aditya Sharma, Tobias Höllerer
One key challenge in augmented reality is the placement of virtual content in natural locations. Existing automated techniques are only able to work with a closed-vocabulary, fixed set of objects. In this paper, we introduce a new open-vocabulary method for object placement. Our eight-stage pipeline leverages recent advances in segmentation models, vision-language models, and LLMs to place any virtual object in any AR camera frame or scene. In a preliminary user study, we show that our method performs at least as well as human experts 57% of the time.