Microsoft – Nweon Paper https://paper.nweon.com
Nweon (映维网): an influential information and data platform for the virtual reality (VR) and augmented reality (AR) industries. Last updated: Tue, 26 Sep 2023 01:43:36 +0000.

VibHead: An Authentication Scheme for Smart Headsets through Vibration https://paper.nweon.com/14615 Thu, 27 Jul 2023 07:28:23 +0000

PubDate: June 2023

Teams: Shandong University;Qingdao University

Writers: Feng Li, Jiayi Zhao, Huan Yang, Dongxiao Yu, Yuanfeng Zhou, Yiran Shen

PDF: VibHead: An Authentication Scheme for Smart Headsets through Vibration

Abstract

Recent years have witnessed the fast penetration of Virtual Reality (VR) and Augmented Reality (AR) systems into our daily life, and the security and privacy issues of VR/AR applications have been attracting considerable attention. Most VR/AR systems adopt head-mounted devices (i.e., smart headsets) to interact with users, and these devices usually store the users’ private data. Hence, authentication schemes are desired for head-mounted devices. Traditional knowledge-based authentication schemes for general personal devices have been proven vulnerable to shoulder-surfing attacks, especially considering that the headsets may block the user’s sight. Although the robustness of knowledge-based authentication can be improved by designing complicated secret codes in virtual space, this approach compromises usability. Another choice is to leverage the users’ biometrics; however, this either relies on highly advanced equipment that may not always be available in commercial headsets or introduces a heavy cognitive load on users.
In this paper, we propose a vibration-based authentication scheme, VibHead, for smart headsets. Since the propagation of vibration signals through the human head presents unique patterns for different individuals, VibHead employs a CNN-based model to classify registered legitimate users based on the features extracted from the vibration signals. We also design a two-step authentication scheme in which the above user classifiers are used to distinguish the legitimate user from illegitimate ones. We implement VibHead on a Microsoft HoloLens equipped with a linear motor and an IMU sensor, both of which are commonly used in off-the-shelf personal smart devices. According to the results of our extensive experiments, with short vibration signals (≤1 s), VibHead achieves outstanding authentication accuracy; both the false acceptance rate (FAR) and the false rejection rate (FRR) are around 5%.
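
The paper's code is not included here; the following is only a minimal sketch of the kind of pipeline the abstract describes: a small 1D CNN over IMU vibration segments followed by a thresholded accept/reject step. The layer sizes, channel count, user count and threshold are illustrative assumptions, not values from the paper.

import torch
import torch.nn as nn

class VibrationCNN(nn.Module):
    """Toy 1D CNN mapping a fixed-length IMU vibration segment to user logits."""
    def __init__(self, n_channels=6, n_users=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                  # global pooling over time
        )
        self.classifier = nn.Linear(64, n_users)

    def forward(self, x):                             # x: (batch, channels, samples)
        return self.classifier(self.features(x).squeeze(-1))

def authenticate(model, segment, threshold=0.9):
    """Second step: accept the top-scoring registered user only above a confidence threshold."""
    probs = torch.softmax(model(segment.unsqueeze(0)), dim=-1)[0]
    conf, user = probs.max(dim=0)
    return (int(user), float(conf)) if conf >= threshold else (None, float(conf))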

ArK: Augmented Reality with Knowledge Interactive Emergent Ability https://paper.nweon.com/14363 Wed, 10 May 2023 04:51:15 +0000

PubDate: Apr 2023

Teams: Microsoft Research, Redmond;MILA;University of Washington;UCLA

Writers: Qiuyuan Huang, Jae Sung Park, Abhinav Gupta, Paul Bennett, Ran Gong, Subhojit Som, Baolin Peng, Owais Khan Mohammed, Chris Pal, Yejin Choi, Jianfeng Gao

PDF: ArK: Augmented Reality with Knowledge Interactive Emergent Ability

Abstract

Despite the growing adoption of mixed reality and interactive AI agents, it remains challenging for these systems to generate high-quality 2D/3D scenes in unseen environments. The common practice requires deploying an AI agent to collect large amounts of data for model training for every new task. This process is costly, or even impossible, for many domains. In this study, we develop an infinite agent that learns to transfer knowledge memory from general foundation models (e.g., GPT-4, DALL-E) to novel domains or scenarios for scene understanding and generation in the physical or virtual world. The heart of our approach is an emerging mechanism, dubbed Augmented Reality with Knowledge Inference Interaction (ArK), which leverages knowledge memory to generate scenes in unseen physical-world and virtual-reality environments. The knowledge interactive emergent ability (Figure 1) is demonstrated in two ways: i) micro-actions of cross-modality, where multi-modality models collect a large amount of relevant knowledge-memory data for each interaction task (e.g., unseen scene understanding) from physical reality; and ii) reality-agnostic macro-behavior, where interactions in mixed-reality environments are improved by tailoring them to different characterized roles, target variables, collaborative information, and so on. We validate the effectiveness of ArK on scene generation and editing tasks. We show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes compared to baselines, demonstrating the potential benefit of incorporating ArK in generative AI for applications such as the metaverse and gaming simulation.

Inside-out Infrared Marker Tracking via Head Mounted Displays for Smart Robot Programming https://paper.nweon.com/14246 Thu, 06 Apr 2023 05:37:22 +0000

PubDate: Mar 2023

Teams: Karlsruhe Institute of Technology;Karlsruhe University of Applied Sciences

Writers: David Puljiz, Alexandru-George Vasilache, Michael Mende, Björn Hein

PDF: Inside-out Infrared Marker Tracking via Head Mounted Displays for Smart Robot Programming

Abstract

Intuitive robot programming through the use of tracked smart input devices relies on fixed, external tracking systems, most often employing infra-red markers. Such an approach is frequently combined with projector-based augmented reality for better visualisation and interfaces. The combined system, although providing an intuitive programming platform with short cycle times even for inexperienced users, is immobile, expensive and requires extensive calibration. When faced with a changing environment and a large number of robots, it becomes sorely impractical. Here we present our work on infra-red marker tracking using the Microsoft HoloLens head-mounted display. The HoloLens can map the environment, register the robot on-line, and track smart devices equipped with infra-red markers in the robot coordinate system. We envision our work providing the basis to transfer many of the paradigms developed over the years for systems requiring a projector and a tracked input device into a highly portable system that does not require any calibration or special set-up. We test the quality of the marker tracking in an industrial robot cell and compare our tracking with a ground truth obtained via an ART-3 tracking system.
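
As a rough illustration of the coordinate handling described above (not code from the paper), the sketch below expresses a marker position tracked in the headset frame in the robot coordinate system, assuming an already-estimated registration transform; the numeric transforms are placeholders.

import numpy as np

def rigid_transform(R, t):
    """Build a 4x4 homogeneous transform from a rotation matrix and a translation."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

# Placeholder registration of the robot base in the headset (world) frame, e.g.
# obtained by aligning the robot model against the HoloLens spatial map.
T_headset_robot = rigid_transform(np.eye(3), np.array([1.0, 0.0, 0.5]))

# Marker position measured by the headset's infra-red tracking, in the headset frame.
p_marker_headset = np.array([0.8, -0.1, 0.6, 1.0])

# Express the tracked marker in the robot coordinate system.
p_marker_robot = np.linalg.inv(T_headset_robot) @ p_marker_headset
print(p_marker_robot[:3])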

Implementation of communication media around a mixed reality experience with HoloLens headset, as part of a digitalization of a nutrition workshop https://paper.nweon.com/14218 Wed, 29 Mar 2023 04:22:22 +0000

PubDate: Mar 2023

Teams: Université Clermont Auvergne;CNRS;Services de Chirurgie Bariatrique/Nutrition;Unité transversale d’éducation du patient;Unité de Recherche Clinique

Writers: Owen Kevin Appadoo, Hugo Rositi, Sylvie Valarier, Marie-Claire Ombret, Émilie Gadéa, Christine Barret-Grimault, Christophe Lohou

PDF: Implementation of communication media around a mixed reality experience with HoloLens headset, as part of a digitalization of a nutrition workshop

Abstract

The release of Microsoft’s HoloLens headset makes it possible to address new types of issues that would have been difficult to tackle without such hardware. This semi-transparent visor headset allows the user who wears it to view the projection of 3D virtual objects placed in their real environment. The user can also interact with these 3D objects, which can interact with each other. The framework of this new technology is called mixed reality. We had the opportunity to digitally transform a conventional human nutrition workshop for patients awaiting bariatric surgery by developing a software application called HOLO_NUTRI using the HoloLens headset. Despite our experience as users and programmers specialized in the development of interactive 3D graphics applications, we realized that such a mixed reality experience required specific programming concepts quite different from those of conventional software or virtual reality applications, but above all required thorough reflection on communication with users. In this article, we explain our design of the communication materials (graphic supports, tutorials on the use of the hardware, explanatory videos), a step that was crucial to the success of our project. The software was used by thirty patients from Le Puy-en-Velay Hospital during 10 sessions of an hour and a half each, during which patients had to familiarize themselves with the headset and the HOLO_NUTRI software. We also asked the patients a series of questions to assess both the adequacy and the importance of this communication approach for such an experience. As mixed reality technology is very recent and the number of applications based on it is increasing significantly, the reflection on the communication elements described in this article (videos, learning exercises for using the headset, a communication leaflet, etc.) can help developers of such applications.

Augmenting Augmented Reality with Non-Line-of-Sight Perception https://paper.nweon.com/14114 Tue, 28 Feb 2023 01:13:39 +0000

PubDate: Feb 2023

Teams: Massachusetts Institute of Technology;University of Michigan

Writers: Tara Boroushaki, Maisy Lam, Laura Dodds, Aline Eid, Fadel Adib

PDF: Augmenting Augmented Reality with Non-Line-of-Sight Perception

Abstract

We present the design, implementation, and evaluation of X-AR, an augmented reality (AR) system with non-line-of-sight perception. X-AR augments AR headsets with RF sensing to enable users to see things that are otherwise invisible to the human eye or to state-of-the-art AR systems. Our design introduces three main innovations: the first is an AR-conformal antenna that tightly matches the shape of the AR headset visor while providing excellent radiation and bandwidth capabilities for RF sensing. The second is an RF-visual synthetic aperture localization algorithm that leverages natural human mobility to localize RF-tagged objects in line-of-sight and non-line-of-sight settings. Finally, the third is an RF-visual verification primitive that fuses RF and vision to deliver actionable tasks to end users such as picking verification. We built an end-to-end prototype of our design by integrating it into a Microsoft HoloLens 2 AR headset and evaluated it in line-of-sight and non-line-of-sight environments. Our results demonstrate that X-AR achieves decimeter-level RF localization (median of 9.8 cm) of fully-occluded items and can perform RF-visual picking verification with over 95% accuracy (F-score) when extracting RFID-tagged items. These results show that X-AR is successful in extending AR systems to non-line-of-sight perception, with important implications for manufacturing, warehousing, and smart home applications. Demo video: y2u.be/bdUN21ft7G0
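
To make the synthetic aperture idea concrete, here is a deliberately simplified backprojection sketch, not the paper's algorithm: candidate tag locations are scored by phase-compensating the RF measurements collected along the user's natural motion. The wavelength and array shapes are assumptions.

import numpy as np

WAVELENGTH = 0.33  # metres, roughly a 915 MHz UHF RFID carrier (assumed)

def backproject(antenna_positions, signals, candidates):
    """antenna_positions: (K, 3) headset antenna poses along the user's motion,
    signals: (K,) complex tag responses, candidates: (M, 3) candidate locations."""
    scores = []
    for x in candidates:
        d = np.linalg.norm(antenna_positions - x, axis=1)   # antenna-to-candidate range
        # compensate the round-trip phase and sum coherently
        scores.append(np.abs(np.sum(signals * np.exp(1j * 4 * np.pi * d / WAVELENGTH))))
    return candidates[int(np.argmax(scores))]

# In an RF-visual system the candidate grid could be restricted to surfaces
# seen by the headset's depth cameras before running the search.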

Spatial audio signal processing for binaural reproduction of recorded acoustic scenes – review and challenges https://paper.nweon.com/14016 Tue, 31 Jan 2023 05:46:28 +0000

PubDate: Oct 2022

Teams: Ben-Gurion University of the Negev;Meta;International Audio Laboratories Erlangen;University of Huddersfield;Microsoft Research;The Australian National University

Writers: Boaz Rafaely, Vladimir Tourbabin, Emanuel Habets, Zamir Ben-Hur, Hyunkook Lee, Hannes Gamper, Lior Arbel, Lachlan Birnie, Thushara Abhayapala, Prasanga Samarasinghe

PDF: Spatial audio signal processing for binaural reproduction of recorded acoustic scenes – review and challenges

Abstract

Spatial audio has been studied for several decades, but has seen much renewed interest recently due to advances in both software and hardware for capture and playback, and the emergence of applications such as virtual reality and augmented reality. This renewed interest has led to the investment of increasing efforts in developing signal processing algorithms for spatial audio, both for capture and for playback. In particular, due to the popularity of headphones and earphones, many spatial audio signal processing methods have dealt with binaural reproduction based on headphone listening. Among these new developments, processing spatial audio signals recorded in real environments using microphone arrays plays an important role. Following this emerging activity, this paper aims to provide a scientific review of recent developments and an outlook for future challenges. This review also proposes a generalized framework for describing spatial audio signal processing for the binaural reproduction of recorded sound. This framework helps to understand the collective progress of the research community, and to identify gaps for future research. It is composed of five main blocks, namely: the acoustic scene, recording, processing, reproduction, and perception and evaluation. First, each block is briefly presented, and then, a comprehensive review of the processing block is provided. This includes topics from simple binaural recording to Ambisonics and perceptually motivated approaches, which focus on careful array configuration and design. Beamforming and parametric-based processing afford more flexible designs and shift the focus to processing and modeling of the sound field. Then, emerging machine- and deep-learning approaches, which take a further step towards flexibility in design, are described. Finally, specific methods for signal transformations such as rotation, translation and enhancement, enabling additional flexibility in reproduction and improvement in the quality of the binaural signal, are presented. The review concludes by highlighting directions for future research.
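
The review contains no code; as a tiny illustration of the most basic processing block it covers, binaural reproduction of a source through a head-related impulse response (HRIR) pair, the following sketch convolves a mono signal with two assumed HRIRs.

import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono source signal with a left/right HRIR pair."""
    return np.stack([fftconvolve(mono, hrir_left),
                     fftconvolve(mono, hrir_right)])

# Toy example: one second of noise rendered with made-up two-tap HRIRs; real HRIRs
# would be selected from a measured or modelled set according to source direction.
fs = 48000
mono = np.random.randn(fs)
binaural = render_binaural(mono, np.array([1.0, 0.3]), np.array([0.6, 0.5]))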

Imitator: Personalized Speech-driven 3D Facial Animation https://paper.nweon.com/13838 Wed, 18 Jan 2023 04:31:20 +0000

PubDate: Dec 2022

Teams: Max Planck Institute for Intelligent Systems;Microsoft Mixed Reality & AI Lab

Writers: Balamurugan Thambiraja, Ikhsanul Habibie, Sadegh Aliakbarian, Darren Cosker, Christian Theobalt, Justus Thies

PDF: Imitator: Personalized Speech-driven 3D Facial Animation

Abstract

Speech-driven 3D facial animation has been widely explored, with applications in gaming, character animation, virtual reality, and telepresence systems. State-of-the-art methods deform the face topology of the target actor to sync with the input audio without considering the identity-specific speaking style and facial idiosyncrasies of the target actor, thus resulting in unrealistic and inaccurate lip movements. To address this, we present Imitator, a speech-driven facial expression synthesis method, which learns identity-specific details from a short input video and produces novel facial expressions matching the identity-specific speaking style and facial idiosyncrasies of the target actor. Specifically, we train a style-agnostic transformer on a large facial expression dataset, which we use as a prior for audio-driven facial expressions. Based on this prior, we optimize for identity-specific speaking style based on a short reference video. To train the prior, we introduce a novel loss function based on detected bilabial consonants to ensure plausible lip closures and consequently improve the realism of the generated expressions. Through detailed experiments and a user study, we show that our approach produces temporally coherent facial expressions from input audio while preserving the speaking style of the target actors.
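
As a loose illustration of the kind of lip-closure term described above (the paper's actual loss is not reproduced here), the sketch below penalises any lip opening on frames labelled as bilabial consonants; the vertex index lists and frame labels are assumed inputs.

import torch

def lip_closure_loss(verts, upper_lip_idx, lower_lip_idx, bilabial_mask):
    """verts: (frames, n_vertices, 3) predicted meshes; upper_lip_idx / lower_lip_idx:
    index lists of opposing lip vertices; bilabial_mask: (frames,) bool, True on /m/, /b/, /p/."""
    gap = (verts[:, upper_lip_idx] - verts[:, lower_lip_idx]).norm(dim=-1).mean(dim=-1)
    mask = bilabial_mask.float()
    # mean lip opening over the frames where the lips should be closed
    return (gap * mask).sum() / mask.sum().clamp(min=1.0)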

Gaze-Vergence-Controlled See-Through Vision in Augmented Reality https://paper.nweon.com/13647 Wed, 07 Dec 2022 02:16:45 +0000

PubDate: Sep 2022

Teams: Beihang University

Writers: Zhimin Wang; Yuxin Zhao; Feng Lu

PDF: Gaze-Vergence-Controlled See-Through Vision in Augmented Reality

Abstract

Augmented Reality (AR) see-through vision is an interesting research topic since it enables users to see through a wall and see the occluded objects. Most existing research focuses on the visual effects of see-through vision, while the interaction method is less studied. However, we argue that using common interaction modalities, e.g., midair click and speech, may not be the optimal way to control see-through vision. This is because when we want to see through something, it is physically related to our gaze depth/vergence and thus should be naturally controlled by the eyes. Following this idea, this paper proposes a novel gaze-vergence-controlled (GVC) see-through vision technique in AR. Since gaze depth is needed, we build a gaze tracking module with two infrared cameras and the corresponding algorithm and assemble it into the Microsoft HoloLens 2 to achieve gaze depth estimation. We then propose two different GVC modes for see-through vision to fit different scenarios. Extensive experimental results demonstrate that our gaze depth estimation is efficient and accurate. By comparing with conventional interaction modalities, our GVC techniques are also shown to be superior in terms of efficiency and more preferred by users. Finally, we present four example applications of gaze-vergence-controlled see-through vision.
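
For intuition only (not the paper's implementation), gaze depth can be related to the vergence angle between the two eyes' gaze rays; the sketch below uses the simplest symmetric-fixation model with an assumed interpupillary distance.

import numpy as np

def gaze_depth_from_vergence(dir_left, dir_right, ipd=0.063):
    """Estimate fixation depth in metres from two unit gaze directions, assuming
    symmetric fixation straight ahead; ipd is the interpupillary distance."""
    vergence = np.arccos(np.clip(np.dot(dir_left, dir_right), -1.0, 1.0))
    return (ipd / 2.0) / np.tan(vergence / 2.0)

# Example: gaze rays converging on a point roughly 1 m in front of the eyes.
left = np.array([0.0315, 0.0, 1.0])
left = left / np.linalg.norm(left)
right = np.array([-0.0315, 0.0, 1.0])
right = right / np.linalg.norm(right)
print(gaze_depth_from_vergence(left, right))   # ~= 1.0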

LaMAR: Benchmarking Localization and Mapping for Augmented Reality https://paper.nweon.com/13300 Mon, 24 Oct 2022 13:37:25 +0000

PubDate: Oct 2022

Teams: ETH Zurich;Microsoft Mixed Reality & AI Lab

Writers: Paul-Edouard Sarlin, Mihai-Alexandru Dusmanu, Johannes L. Schönberger, Pablo Speciale, Lukas Gruber, Viktor Larsson, Ondrej Miksik, Marc Pollefeys

PDF: LaMAR: Benchmarking Localization and Mapping for Augmented Reality

Abstract

Localization and mapping is the foundational technology for augmented reality (AR) that enables sharing and persistence of digital content in the real world. While significant progress has been made, researchers are still mostly driven by unrealistic benchmarks not representative of real-world AR scenarios. In particular, benchmarks are often based on small-scale datasets with low scene diversity, captured from stationary cameras, and lacking other sensor inputs like inertial, radio, or depth data. Furthermore, ground-truth (GT) accuracy is mostly insufficient to satisfy AR requirements. To close this gap, we introduce a new benchmark with a comprehensive capture and GT pipeline, which allows us to co-register realistic AR trajectories in diverse scenes and from heterogeneous devices at scale. To establish accurate GT, our pipeline robustly aligns the captured trajectories against laser scans in a fully automatic manner. Based on this pipeline, we publish a benchmark dataset of diverse and large-scale scenes recorded with head-mounted and hand-held AR devices. We extend several state-of-the-art methods to take advantage of the AR-specific setup and evaluate them on our benchmark. Based on the results, we present novel insights on current research gaps to provide avenues for future work in the community.

3D Face Reconstruction with Dense Landmarks https://paper.nweon.com/13298 Mon, 24 Oct 2022 13:37:23 +0000

PubDate: Oct 2022

Teams: Microsoft

Writers: Erroll Wood, Tadas Baltrušaitis, Charlie Hewitt, Matthew Johnson, Jingjing Shen, Nikola Milosavljevic, Daniel Wilde, Stephan Garbin, Chirag Raman, Jamie Shotton, Toby Sharp, Ivan Stojiljkovic, Thomas J. Cashman, Julien Valentin

PDF: 3D Face Reconstruction with Dense Landmarks

Abstract

Landmarks often play a key role in face analysis, but many aspects of identity or expression cannot be represented by sparse landmarks alone. Thus, in order to reconstruct faces more accurately, landmarks are often combined with additional signals like depth images or techniques like differentiable rendering.

Can we keep things simple by just using more landmarks?

In answer, we present the first method that accurately predicts ten times as many landmarks as usual, covering the whole head, including the eyes and teeth. This is accomplished using synthetic training data, which guarantees perfect landmark annotations. By fitting a morphable model to these dense landmarks, we achieve state-of-the-art results for monocular 3D face reconstruction in the wild. We show that dense landmarks are an ideal signal for integrating face shape information across frames by demonstrating accurate and expressive facial performance capture in both monocular and multi-view scenarios. Finally, our method is highly efficient: we can predict dense landmarks and fit our 3D face model at over 150FPS on a single CPU thread.
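
To make the "fit a morphable model to dense landmarks" step concrete, here is a heavily simplified sketch: a linear shape model is fitted to 2D landmarks with regularised least squares under an orthographic camera. The dimensions, camera model and regulariser are illustrative assumptions, not the paper's energy.

import numpy as np

def fit_shape_coefficients(mean, basis, landmarks_2d, scale=1.0, reg=1e-3):
    """mean: (3N,) stacked as x0, y0, z0, x1, ...; basis: (3N, K) linear shape basis;
    landmarks_2d: (N, 2) detected landmarks. Orthographic camera, regularised LSQ."""
    n = landmarks_2d.shape[0]
    P = np.zeros((2 * n, mean.shape[0]))
    P[0::2, 0::3] = np.eye(n) * scale   # project model x to image x
    P[1::2, 1::3] = np.eye(n) * scale   # project model y to image y
    A = P @ basis
    b = landmarks_2d.reshape(-1) - P @ mean
    coeffs = np.linalg.solve(A.T @ A + reg * np.eye(basis.shape[1]), A.T @ b)
    return coeffs   # reconstructed shape: mean + basis @ coeffs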

Attention Guidance for Tower ATC Using Augmented Reality Devices https://paper.nweon.com/13157 Wed, 21 Sep 2022 04:13:22 +0000

PubDate: May 2022

Teams: Royal Netherlands Aerospace Centre

Writers: Jürgen Teutsch; Tanja Bos; Marcel van Apeldoorn; Lansenou Camara

PDF: Attention Guidance for Tower ATC Using Augmented Reality Devices

Abstract

In 2021 Royal NLR carried out innovative technology experiments on their high-fidelity real-time air traffic control simulation and validation platform, NARSIM. These experiments were part of the SESAR 2020 project Digital Technologies for Tower (DTT). The technology option that was investigated focused on advanced HMI interaction modes for aerodrome tower controllers. More particularly, Attention Capturing and Guidance strategies with an Augmented Reality device, the Microsoft HoloLens 2™, were evaluated inside an aerodrome control tower environment for Amsterdam Airport Schiphol, one of the major European hub airports. The NARSIM environment consisted of a realistic but downscaled presentation of the airport with two tower controller working positions emulating current tower systems. Such a set-up allowed researchers to focus their work on the application of Augmented Reality with the introduction of (virtual) aircraft labels as well as special symbology and auditory cues for capturing and guiding tower controller attention in the case of critical events. Several typical attention-critical events that may occur at an airport, such as go-around operations and runway incursions, were orchestrated by a team of NLR experts and presented to the tower controllers while they were operating traffic as usual. Human performance and ATC operational experts observed and analyzed the simulations. This paper describes the steps taken and the challenges encountered when integrating the HoloLens inside the NARSIM Tower environment. Furthermore, it explores the proposed operational concept for Attention Capturing and Guidance with the HoloLens and how it was realized inside the device. The results of the technical evaluation activity with two experienced air traffic controllers are described in detail. These results led to the conclusion that the device, in combination with the concept, was a favorable addition to the controller working environment. While desired technical performance improvements, mostly related to user comfort and general adjustments, depend on further vendor development, the HoloLens used was seen as a technically useful device for implementing prototype applications for Attention Capturing and Guidance with aural and visual cues. In the final sections of the paper, an outlook on the expected future use of Augmented Reality devices in conventional control tower environments is given.

Control with Vergence Eye Movement in Augmented Reality See-Through Vision https://paper.nweon.com/13024 Tue, 06 Sep 2022 02:19:21 +0000

PubDate: April 2022

Teams: Beihang University

Writers: Zhimin Wang; Yuxin Zhao; Feng Lu

PDF: Control with Vergence Eye Movement in Augmented Reality See-Through Vision

Abstract

Augmented Reality (AR) see-through vision has become a recent research focus since it enables the user to see through a wall and view the occluded objects. Most existing works only used common modalities to control the display for see-through vision, e.g., button clicking and speech control. However, we use the visual system to observe see-through vision, so using an additional interaction channel will distract the user and degrade the user experience. In this paper, we propose a novel interaction method using vergence eye movement for controlling see-through vision in AR. Specifically, we first customize eye cameras and design a gaze depth estimation method for the Microsoft HoloLens 2. With our algorithm, the fixation depth can be computed from the vergence and used to manage the see-through vision. We also propose two control techniques based on gaze vergence. The experimental results show that the gaze depth estimation method is efficient, and no difference was found between the two modalities in terms of completion time and number of successes.

A Pilot Study on The Impact of Stereoscopic Display Type on User Interactions Within A Immersive Analytics Environment https://paper.nweon.com/12940 Wed, 31 Aug 2022 04:31:19 +0000

PubDate: Jul 2022

Teams: Colorado State University;Microsoft Research

Writers: Adam S. Williams, Xiaoyan Zhou, Michel Pahud, Francisco R. Ortega

PDF: A Pilot Study on The Impact of Stereoscopic Display Type on User Interactions Within A Immersive Analytics Environment

Abstract

Immersive Analytics (IA) and consumer adoption of augmented reality (AR) and virtual reality (VR) head-mounted displays (HMDs) are both rapidly growing. When used in conjunction, stereoscopic IA environments can offer improved user understanding and engagement; however, it is unclear how the choice of stereoscopic display impacts user interactions within an IA environment. This paper presents a pilot study that examines the impact of stereoscopic display type on object manipulation and environmental navigation using consumer-available AR and VR displays. This work finds that the display type can impact how users manipulate virtual content, how they navigate the environment, and how able they are to answer questions about the represented data.

Path Tracing in 2D, 3D, and Physicalized Networks https://paper.nweon.com/12934 Wed, 31 Aug 2022 02:34:20 +0000

PubDate: Jul 2022

Teams: École de technologie supérieure

Writers: Michael J. McGuffin (ETS), Ryan Servera (ETS), Marie Forest (ETS)

PDF: Path Tracing in 2D, 3D, and Physicalized Networks

Abstract

It is common to advise against using 3D to visualize abstract data such as networks; however, Ware and Mitchell’s 2008 study showed that path tracing in a network is less error-prone in 3D than in 2D. It is unclear, however, if 3D retains its advantage when the 2D presentation of a network is improved using edge-routing, and when simple interaction techniques for exploring the network are available. We address this with two studies of path tracing under new conditions. The first study was preregistered, involved 34 users, and compared 2D and 3D layouts that the user could rotate and move in virtual reality with a handheld controller. Error rates were lower in 3D than in 2D, despite the use of edge-routing in 2D and the use of mouse-driven interactive highlighting of edges. The second study involved 12 users and investigated data physicalization, comparing 3D layouts in virtual reality versus physical 3D printouts of networks augmented with a Microsoft HoloLens headset. No difference was found in error rate, but users performed a variety of actions with their fingers in the physical condition, which can inform new interaction techniques.

MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control https://paper.nweon.com/12909 Sun, 28 Aug 2022 23:43:48 +0000

PubDate: Aug 2022

Teams: Georgia Institute of Technology;Microsoft Research

Writers: Nolan Wagener, Andrey Kolobov, Felipe Vieira Frujeri, Ricky Loynd, Ching-An Cheng, Matthew Hausknecht

PDF: MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control

Abstract

Simulated humanoids are an appealing research domain due to their physical capabilities. Nonetheless, they are also challenging to control, as a policy must drive an unstable, discontinuous, and high-dimensional physical system. One widely studied approach is to utilize motion capture (MoCap) data to teach the humanoid agent low-level skills (e.g., standing, walking, and running) that can then be re-used to synthesize high-level behaviors. However, even with MoCap data, controlling simulated humanoids remains very hard, as MoCap data offers only kinematic information. Finding physical control inputs to realize the demonstrated motions requires computationally intensive methods like reinforcement learning. Thus, despite the publicly available MoCap data, its utility has been limited to institutions with large-scale compute. In this work, we dramatically lower the barrier for productive research on this topic by training and releasing high-quality agents that can track over three hours of MoCap data for a simulated humanoid in the dm_control physics-based environment. We release MoCapAct (Motion Capture with Actions), a dataset of these expert agents and their rollouts, which contain proprioceptive observations and actions. We demonstrate the utility of MoCapAct by using it to train a single hierarchical policy capable of tracking the entire MoCap dataset within dm_control and show the learned low-level component can be re-used to efficiently learn downstream high-level tasks. Finally, we use MoCapAct to train an autoregressive GPT model and show that it can control a simulated humanoid to perform natural motion completion given a motion prompt.
Videos of the results and links to the code and dataset are available at this https URL.
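
As one illustration of how such expert rollouts can be consumed (the dataset's actual file layout and APIs are not reproduced; the tensors below are stand-ins), this sketch behaviour-clones a small policy on observation/action pairs.

import torch
import torch.nn as nn

# Stand-in rollout tensors; in practice these would be proprioceptive observations
# and expert actions loaded from the released rollouts.
obs = torch.randn(1024, 64)
act = torch.randn(1024, 20)

policy = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 20))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(200):                              # behaviour cloning loop
    idx = torch.randint(0, obs.shape[0], (128,))
    loss = ((policy(obs[idx]) - act[idx]) ** 2).mean()  # regress expert actions
    opt.zero_grad()
    loss.backward()
    opt.step()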

Depth Perception in Augmented Reality: The Effects of Display, Shadow, and Position https://paper.nweon.com/12866 Wed, 24 Aug 2022 06:16:23 +0000

PubDate: April 2022

Teams: Vanderbilt University;University of Utah

Writers: Haley Adams; Jeanine Stefanucci; Sarah Creem-Regehr; Bobby Bodenheimer

PDF: Depth Perception in Augmented Reality: The Effects of Display, Shadow, and Position

Abstract

Although it is commonly accepted that depth perception in augmented reality (AR) displays is distorted, we have yet to isolate which properties of AR affect people’s ability to correctly perceive virtual objects in real spaces. From prior research on depth perception in commercial virtual reality, it is likely that ergonomic properties and graphical limitations impact visual perception in head-mounted displays (HMDs). However, an insufficient amount of research has been conducted in augmented reality HMDs for us to begin isolating pertinent factors in this family of displays. To this end, in the current research, we evaluate absolute measures of distance perception in the Microsoft HoloLens 2, an optical see-through AR display, and the Varjo XR-3, a video see-through AR display. The current work is the first to evaluate either device using absolute distance perception as a measure. For each display, we asked participants to verbally report distance judgments to both grounded and floating targets that were rendered either with or without a cast shadow along the ground. Our findings suggest that currently available video see-through displays may induce more distance underestimation than their optical see-through counterparts. We also find that the vertical position of an object and the presence of a cast shadow influence depth perception.

Contactless Haptic Display Through Magnetic Field Control https://paper.nweon.com/12773 Tue, 16 Aug 2022 07:55:38 +0000

PubDate: February 2022

Teams: Nanjing University of Aeronautics and Astronautics;UNSW

Writers: Xiong Lu; Yuxing Yan; Beibei Qi; Huang Qian; Junbin Sun; Aaron Quigley

PDF: Contactless Haptic Display Through Magnetic Field Control

Abstract

Haptic rendering enables people to touch, perceive, and manipulate virtual objects in a virtual environment. Using six cascaded identical hollow disk electromagnets and a small permanent magnet attached to an operator’s finger, this paper proposes and develops an untethered haptic interface through magnetic field control. The concentric hole inside the six cascaded electromagnets provides the workspace, where the 3D position of the permanent magnet is tracked with a Microsoft Kinect sensor. The driving currents of the six cascaded electromagnets are calculated in real time to generate the desired magnetic force. Offline data from an FEA (finite element analysis) based simulation determine the relationship between the magnetic force, the driving currents, and the position of the permanent magnet. A set of experiments, including a virtual object recognition experiment, a virtual surface identification experiment, and a user perception evaluation experiment, was conducted to demonstrate the proposed system, where Microsoft HoloLens holographic glasses are used for visual rendering. The proposed magnetic haptic display leads to an untethered and non-contact interface for natural haptic rendering applications, which overcomes the constraints of mechanical linkages in tool-based traditional haptic devices.
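
A toy version of the current-computation step (not the paper's FEA-derived model): given a position-dependent matrix that maps the six coil currents to the force on the finger magnet, the driving currents for a desired force can be obtained by least squares.

import numpy as np

def currents_for_force(force_desired, force_per_ampere, i_max=2.0):
    """force_per_ampere: (3, 6) matrix mapping the six coil currents to force on the
    finger magnet at its current position (in the paper this comes from FEA data).
    Returns the least-squares currents, clipped to an assumed coil limit."""
    currents, *_ = np.linalg.lstsq(force_per_ampere, force_desired, rcond=None)
    return np.clip(currents, -i_max, i_max)

# Placeholder model and a request for 0.1 N of upward force.
A = np.random.default_rng(0).normal(scale=0.05, size=(3, 6))
print(currents_for_force(np.array([0.0, 0.0, 0.1]), A))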

Measuring understorey vegetation structure using a novel mixed-reality device https://paper.nweon.com/12613 Wed, 13 Jul 2022 23:46:21 +0000

PubDate: June 2022

Teams: Rice University

Writers: Daniel Gorczynski,Lydia Beaudrot

PDF: Measuring understorey vegetation structure using a novel mixed-reality device

Abstract

Most ecological studies of vegetation structure have relied on manual field measurements that are labour-intensive and time-consuming. Many current alternatives to classical measurements are expensive or difficult to transport to field settings.
Here we evaluated a new method for measuring understorey vegetation with a novel mixed-reality, remote sensing device, the Microsoft HoloLens. We developed a vegetation sensing application called VegSense that allows the HoloLens user to control the device’s environmental scanners to measure understorey vegetation. Using VegSense, we tested the ability of the Microsoft HoloLens relative to classical field measurements to (a) detect trees and saplings, (b) measure diameter at breast height (DBH), (c) detect individual understorey vegetation structures and (d) estimate understorey vegetation complexity replicating the rod-transect method.
We found that VegSense performed well at detecting and measuring trees with a DBH of 17 cm or more and at estimating vegetation complexity, and performed moderately well at detecting understorey vegetation.
Our results indicate that the HoloLens is a suitable alternative for multiple classical field measurements of understorey vegetation. This method costs much less than typical terrestrial LiDAR systems, and can facilitate efficient, high-quality environmental data collection. Further software development has the potential to reveal additional ways in which this device can be harnessed for applications to ecology and evolution.
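
For concreteness (this is not VegSense code), a DBH estimate can be obtained from a scanned trunk by taking the points in a thin horizontal slice at breast height and fitting a circle; the sketch below uses a simple algebraic least-squares fit and assumed slice parameters.

import numpy as np

def estimate_dbh_cm(trunk_points, breast_height=1.3, slice_half_width=0.05):
    """trunk_points: (N, 3) scanned points of one trunk, z up, metres. Returns DBH in cm."""
    ring = trunk_points[np.abs(trunk_points[:, 2] - breast_height) < slice_half_width][:, :2]
    x, y = ring[:, 0], ring[:, 1]
    # Kasa algebraic circle fit: x^2 + y^2 + a*x + b*y + c = 0.
    A = np.column_stack([x, y, np.ones_like(x)])
    a, b, c = np.linalg.lstsq(A, -(x ** 2 + y ** 2), rcond=None)[0]
    cx, cy = -a / 2.0, -b / 2.0
    radius = np.sqrt(cx ** 2 + cy ** 2 - c)
    return 2.0 * radius * 100.0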

A Fast Forest Reverberator Using Single Scattering Cylinders https://paper.nweon.com/12573 Tue, 12 Jul 2022 05:52:19 +0000

PubDate: March 2022

Teams: University of Maryland;Microsoft Research

Writers: Shoken Kaneko; Hannes Gamper

PDF: A Fast Forest Reverberator Using Single Scattering Cylinders

Abstract

Simulating forest acoustics has important applications for rendering forest sound scenes in mixed and virtual reality, developing wildlife monitoring systems that use microphone arrays distributed in a forest, or as an artistic effect. Previously proposed methods for forest impulse response (IR) synthesis are limited to small or sparse forests because of their cubic asymptotic complexity with respect to the number of trees. Here we propose a simple and efficient parametric forest IR generation algorithm that relies on a multitude of single scattering cylinders to approximate scattering caused by tree trunks. The proposed method was compared to measured forest IRs in terms of the IR echo density, energy decay, reverberation time (T60), and clarity (C50). Experimental results indicate that the proposed algorithm generates forest reverb with acoustic characteristics similar to real forest IRs at a low computational cost.
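
A much-simplified illustration of the idea (not the published algorithm): each trunk is treated as a single scatterer contributing one delayed, attenuated echo along the source-trunk-listener path, and all echoes are accumulated into an impulse response. The scattering gains and geometry are placeholders.

import numpy as np

def forest_ir(trees, src, lis, fs=48000, c=343.0, length_s=1.0, reflectivity=0.1):
    """trees: (N, 2) trunk positions in the ground plane; src, lis: (2,) positions in metres."""
    ir = np.zeros(int(fs * length_s))
    direct = np.linalg.norm(src - lis)
    ir[int(round(fs * direct / c))] += 1.0 / max(direct, 1.0)         # direct sound
    for tree in trees:
        d = np.linalg.norm(src - tree) + np.linalg.norm(tree - lis)   # source-trunk-listener path
        n = int(round(fs * d / c))
        if n < len(ir):
            ir[n] += reflectivity / max(d, 1.0)                       # crude single-scattering echo
    return ir

rng = np.random.default_rng(1)
ir = forest_ir(rng.uniform(-50, 50, size=(500, 2)), np.array([0.0, -5.0]), np.array([0.0, 5.0]))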

A 3D Reconstruction Method for Augmented Reality Sandbox Based on Depth Sensor https://paper.nweon.com/12534 Mon, 11 Jul 2022 00:58:29 +0000

PubDate: February 2022

Teams: Shanghai Branch of the National Computer Network Emergency Response Technical Team

Writers: Xindan Wang; Qu Chen; Zhi Li

PDF: A 3D Reconstruction Method for Augmented Reality Sandbox Based on Depth Sensor

Abstract

This paper builds an Augmented Reality Sandbox (AR Sandbox) system based on augmented reality technology and performs a 3D reconstruction of the sandbox terrain using the Microsoft Kinect depth sensor in the AR Sandbox, as an entry point to pave the way for later development of related metaverse applications, such as metaverse architecting and visual interactive modeling. The innovation of this paper is that, for the AR Sandbox scene, a 3D reconstruction method based on a depth sensor is proposed, which can automatically crop the edge of the sandbox table in the Kinect field of view and accurately and completely reconstruct the sandbox terrain in MATLAB.

Synthesis of User-Level 3D Head Avatar from Smartphone Video for Virtual Reality https://paper.nweon.com/12503 Wed, 06 Jul 2022 07:25:20 +0000

PubDate: January 2022

Teams: Bangladesh University of Engineering and Technology

Writers: Md Mushfiqur Rahman; S. M. Mahbubur Rahman

PDF: Synthesis of User-Level 3D Head Avatar from Smartphone Video for Virtual Reality

Abstract

The purpose of an avatar is to intensify the sense of reality in a simulated environment such as a virtual reality system. This is done by creating a replica of a real person in virtual reality, as intricately as possible. The availability of 3D scanners (e.g., Microsoft Kinect) has produced vast opportunities for creating 3D models of celebrities and other people who can afford them. These humanoid 3D models, as well as fictional human avatars, opened the door to perceiving a more realistic 3D system than before. However, a simplified approach to fully rigged 3D head avatar synthesis may help expand the application of 3D avatars, allowing anyone to enjoy a personalized experience with their own avatar. In this work, we propose a system for 3D head avatar synthesis at the user end, using an easily accessible and low-cost sensor (a smartphone camera). Our proposed methodology consists of video data acquisition of the person concerned, feature extraction and matching, image registration, structure-from-motion reconstruction, and post-processing. Without requiring any training data, our method provides satisfactory performance based on the experimental results. Additionally, we release a public database for avatar synthesis from smartphone video and introduce a set of attributes to evaluate the synthesized 3D avatars.
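
To ground the "feature extraction and matching" step (an illustration only, not the authors' pipeline), two frames of the smartphone video can be matched with ORB features in OpenCV as below.

import cv2

def match_frames(frame_a, frame_b, max_features=2000):
    """Detect ORB keypoints in two grayscale frames and match them by Hamming distance."""
    orb = cv2.ORB_create(max_features)
    kp_a, des_a = orb.detectAndCompute(frame_a, None)
    kp_b, des_b = orb.detectAndCompute(frame_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    return kp_a, kp_b, matches

# The frames would come from the captured smartphone video, e.g. via cv2.VideoCapture.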

Quantifying the Effects of Working in VR for One Week https://paper.nweon.com/12389 Thu, 16 Jun 2022 02:16:25 +0000

PubDate: Jun 2022

Teams: Coburg University of Applied Sciences;Microsoft Research;University of Cambridge;University of Primorska

Writers: Verena Biener, Snehanjali Kalamkar, Negar Nouri, Eyal Ofek, Michel Pahud, John J. Dudley, Jinghui Hu, Per Ola Kristensson, Maheshya Weerasinghe, Klen Čopič Pucihar, Matjaž Kljun, Stephan Streuber, Jens Grubert

PDF: Quantifying the Effects of Working in VR for One Week

Abstract

Virtual Reality (VR) provides new possibilities for modern knowledge work. However, the potential advantages of virtual work environments can only be used if it is feasible to work in them for an extended period of time. Until now, there are limited studies of long-term effects when working in VR. This paper addresses the need for understanding such long-term effects. Specifically, we report on a comparative study (n=16), in which participants were working in VR for an entire week – for five days, eight hours each day – as well as in a baseline physical desktop environment. This study aims to quantify the effects of exchanging a desktop-based work environment with a VR-based environment. Hence, during this study, we do not present the participants with the best possible VR system but rather a setup delivering a comparable experience to working in the physical desktop environment. The study reveals that, as expected, VR results in significantly worse ratings across most measures. Among other results, we found concerning levels of simulator sickness, below average usability ratings and two participants dropped out on the first day using VR, due to migraine, nausea and anxiety. Nevertheless, there is some indication that participants gradually overcame negative first impressions and initial discomfort. Overall, this study helps lay the groundwork for subsequent research, by clearly highlighting current shortcomings and identifying opportunities for improving the experience of working in VR.

BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis https://paper.nweon.com/12345 Mon, 06 Jun 2022 07:52:26 +0000

PubDate: May 2022

Teams: University of Science and Technology of China;Microsoft Research Asia;Imperial College London;Microsoft Azure Speech;University of Surrey;South China University of Technology

Writers: Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo Mandic, Lei He, Xiang-Yang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu

PDF: BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis

Abstract

Binaural audio plays a significant role in constructing immersive augmented and virtual realities. As it is expensive to record binaural audio from the real world, synthesizing it from mono audio has attracted increasing attention. This synthesis process involves not only the basic physical warping of the mono audio, but also room reverberations and head/ear-related filtering, which, however, are difficult to simulate accurately in traditional digital signal processing. In this paper, we formulate the synthesis process from a different perspective by decomposing the binaural audio into a common part that is shared by the left and right channels and a specific part that differs in each channel. Accordingly, we propose BinauralGrad, a novel two-stage framework equipped with diffusion models to synthesize them respectively. Specifically, in the first stage, the common information of the binaural audio is generated with a single-channel diffusion model conditioned on the mono audio, based on which the binaural audio is generated by a two-channel diffusion model in the second stage. Combining this novel perspective of two-stage synthesis with advanced generative models (i.e., the diffusion models), the proposed BinauralGrad is able to generate accurate and high-fidelity binaural audio samples. Experimental results show that on a benchmark dataset, BinauralGrad outperforms the existing baselines by a large margin in terms of both objective and subjective evaluation metrics (Wave L2: 0.128 vs. 0.157, MOS: 3.80 vs. 3.61). The generated audio samples are available online.
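
The two-stage decomposition described above can be illustrated very simply (this shows only the signal split, not the diffusion models): the common part is the channel average and the specific part is the per-channel residual.

import numpy as np

def split_binaural(binaural):
    """binaural: (2, T) waveform. Returns the shared part and per-channel residuals."""
    common = binaural.mean(axis=0)        # first-stage target: content shared by both ears
    specific = binaural - common          # second-stage target: left/right differences
    return common, specific

def merge_binaural(common, specific):
    return common + specific              # exact reconstruction of the (2, T) signal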

DreamStream: Immersive and Interactive Spectating in VR https://paper.nweon.com/12294 Wed, 25 May 2022 07:13:25 +0000

PubDate:

Teams: Microsoft Research

Writers: Balasaravanan Thoravi Kumaravel;Andrew D Wilson

PDF: DreamStream: Immersive and Interactive Spectating in VR

Abstract

Today, spectating and streaming virtual reality (VR) activities typically involve spectators viewing a 2D stream of the VR user’s view. Streaming 2D videos of the game play is popular and well supported by platforms such as Twitch. However, the generic streaming of full 3D representations is less explored. Thus, while the VR player’s experience may be fully immersive, spectators are limited to 2D videos. This asymmetry lessens the overall experience for spectators, who themselves may be eager to spectate in VR. DreamStream puts viewers in the virtual environment of the VR application, allowing them to look “over the shoulder” of the VR player. Spectators can view streamed VR content immersively in 3D, independently explore the VR scene beyond what the VR player sees and ultimately cohabit the virtual environment alongside the VR player. For the VR player, DreamStream provides spatial awareness of all their spectators. DreamStream retrofits and works with existing VR applications. We discuss the design and implementation of DreamStream, and carry out three qualitative informal evaluations. These evaluations shed light on the strengths and weaknesses of using DreamStream for the purpose of interactive spectating. Our participants found that DreamStream’s VR viewer interface offered increased immersion and made it easier to communicate and interact with the VR player.

Rotation-constrained optical see-through headset calibration with bare-hand alignment https://paper.nweon.com/12062 Thu, 21 Apr 2022 06:13:21 +0000

PubDate:

Teams: Imperial College London

Writers: Xue Hu; Ferdinando Rodriguez y Baena; Fabrizio Cutolo

PDF: Rotation-constrained optical see-through headset calibration with bare-hand alignment

Abstract

The inaccessibility of user-perceived reality remains an open issue in pursuing the accurate calibration of optical see-through (OST) head-mounted displays (HMDs). Manual user alignment is usually required to collect a set of virtual-to-real correspondences, so that a default or an offline display calibration can be updated to account for the user’s eye position(s). Current alignment-based calibration procedures usually require point-wise alignments between rendered image point(s) and associated physical landmark(s) of a target calibration tool. As each alignment can only provide one or a few correspondences, repeated alignments are required to ensure calibration quality. This work presents an accurate and tool-less online OST calibration method to update an offline-calibrated eye-display model. The user’s bare hand is markerlessly tracked by a commercial RGBD camera anchored to the OST headset to generate a user-specific cursor for correspondence collection. The required alignment is object-wise, and can provide thousands of unordered corresponding points in tracked space. The collected correspondences are registered by a proposed rotation-constrained iterative closest point (rcICP) method to optimise the viewpoint-related calibration parameters. We implemented such a method for the Microsoft HoloLens 1. The resiliency of the proposed procedure to noisy data was evaluated through simulated tests and real experiments performed with an eye-replacement camera. According to the simulation test, the rcICP registration is robust against possible user-induced rotational misalignment. With a single alignment, our method achieves 8.81 arcmin (1.37 mm) positional error and 1.76° rotational error in camera-based tests at arm-reach distance, and 10.79 arcmin (7.71 pixels) reprojection error in user tests.
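
For orientation, here is a generic point-to-point ICP sketch, without the rotation constraint that gives rcICP its name: each iteration matches the collected hand-cursor points to their nearest counterparts and solves for the best rigid transform with the Kabsch method.

import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iterations=30):
    """Rigidly align source (N, 3) to target (M, 3); returns R, t and the moved points."""
    src = source.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    tree = cKDTree(target)
    for _ in range(iterations):
        _, idx = tree.query(src)                        # nearest-neighbour correspondences
        matched = target[idx]
        mu_s, mu_t = src.mean(axis=0), matched.mean(axis=0)
        H = (src - mu_s).T @ (matched - mu_t)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T                              # Kabsch rotation estimate
        t = mu_t - R @ mu_s
        src = src @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total, src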

Measuring the Perceived Three-Dimensional Location of Virtual Objects in Optical See-Through Augmented Reality https://paper.nweon.com/12060 Thu, 21 Apr 2022 05:55:28 +0000

PubDate: November 2021

Teams: Mississippi State University;University of Nebraska–Lincoln

Writers: Farzana Alam Khan; Veera Venkata Ram Murali Krishna Rao Muvva; Dennis Wu; Mohammed Safayet Arefin; Nate Phillips; J. Edward Swan

PDF: Measuring the Perceived Three-Dimensional Location of Virtual Objects in Optical See-Through Augmented Reality

Abstract

For optical see-through augmented reality (AR), a new method for measuring the perceived three-dimensional location of virtual objects is presented, where participants verbally report a virtual object’s location relative to both a vertical and horizontal grid. The method is tested with a small (1.95 × 1.95 × 1.95 cm) virtual object at distances of 50 to 80 cm, viewed through a Microsoft HoloLens 1st generation AR display. Two experiments examine two different virtual object designs, whether turning in a circle between reported object locations disrupts HoloLens tracking, and whether accuracy errors, including a rightward bias and underestimated depth, might be due to systematic errors that are restricted to a particular display. Turning in a circle did not disrupt HoloLens tracking, and testing with a second display did not suggest systematic errors restricted to a particular display. Instead, the experiments are consistent with the hypothesis that, when looking downwards at a horizontal plane, HoloLens 1st generation displays exhibit a systematic rightward perceptual bias. Precision analysis suggests that the method could measure the perceived location of a virtual object within an accuracy of less than 1 mm.

Walking Through Walls: The Effect of Collision-Based Feedback on Affordance Judgments in Augmented Reality https://paper.nweon.com/12028 Thu, 21 Apr 2022 04:34:43 +0000

PubDate: November 2021

Teams: University of Utah;Vanderbilt University

Writers: Holly C. Gagnon; Dun Na; Keith Heiner; Jeanine Stefanucci; Sarah Creem-Regehr; Bobby Bodenheimer

PDF: Walking Through Walls: The Effect of Collision-Based Feedback on Affordance Judgments in Augmented Reality

Abstract

Feedback about actions in augmented reality (AR) is limited and can be ambiguous due to the nature of interacting with virtual objects. AR devices also have a restricted field of view (FOV), limiting the amount of available visual information that can be used to perform an action or provide feedback during or after an action. We used the Microsoft HoloLens 1 to investigate whether perceptual-motor, collision-based outcome feedback calibrates judgments of whether one can pass through an aperture in AR. Additionally, we manipulated the amount of information available within the FOV by having participants view the aperture at two different distances. Feedback calibrated passing-through judgments at both distances but resulted in an overestimation of the just-passable aperture width. Moreover, the far viewing condition had more overestimation of just-passable aperture width than the near viewing condition.

Spatial Computing and Intuitive Interaction: Bringing Mixed Reality and Robotics Together https://paper.nweon.com/11843 Tue, 08 Mar 2022 23:34:25 +0000

PubDate: Feb 2022

Teams: Microsoft Mixed Reality and AI Lab;ETH Zurich

Writers: Jeffrey Delmerico, Roi Poranne, Federica Bogo, Helen Oleynikova, Eric Vollenweider, Stelian Coros, Juan Nieto, Marc Pollefeys

PDF: Spatial Computing and Intuitive Interaction: Bringing Mixed Reality and Robotics Together

Abstract

Spatial computing – the ability of devices to be aware of their surroundings and to represent this digitally – offers novel capabilities in human-robot interaction. In particular, the combination of spatial computing and egocentric sensing on mixed reality devices enables them to capture and understand human actions and translate these to actions with spatial meaning, which offers exciting new possibilities for collaboration between humans and robots. This paper presents several human-robot systems that utilize these capabilities to enable novel robot use cases: mission planning for inspection, gesture-based control, and immersive teleoperation. These works demonstrate the power of mixed reality as a tool for human-robot interaction, and the potential of spatial computing and mixed reality to drive the future of human-robot interaction.

Accuracy Evaluation of Touch Tasks in Commodity Virtual and Augmented Reality Head-Mounted Displays https://paper.nweon.com/11320 Thu, 14 Oct 2021 04:26:14 +0000

PubDate: Sep 2021

Teams: Coburg University of Applied Sciences and Arts;University of Primorska;Microsoft Research;University of Cambridge

Writers: Daniel Schneider, Verena Biener, Alexander Otte, Travis Gesslein, Philipp Gagel, Cuauhtli Campos, Klen Čopič Pucihar, Matjaž Kljun, Eyal Ofek, Michel Pahud, Per Ola Kristensson, Jens Grubert

PDF: Accuracy Evaluation of Touch Tasks in Commodity Virtual and Augmented Reality Head-Mounted Displays

Abstract

An increasing number of consumer-oriented head-mounted displays (HMD) for augmented and virtual reality (AR/VR) are capable of finger and hand tracking. We report on the accuracy of off-the-shelf VR and AR HMDs when used for touch-based tasks such as pointing or drawing. Specifically, we report on the finger tracking accuracy of the VR head-mounted displays Oculus Quest, Vive Pro and the Leap Motion controller, when attached to a VR HMD, as well as the finger tracking accuracy of the AR head-mounted displays Microsoft HoloLens 2 and Magic Leap. We present the results of two experiments in which we compare the accuracy for absolute and relative pointing tasks using both human participants and a robot. The results suggest that HTC Vive has a lower spatial accuracy than the Oculus Quest and Leap Motion and that the Microsoft HoloLens 2 provides higher spatial accuracy than Magic Leap One. These findings can serve as decision support for researchers and practitioners in choosing which systems to use in the future.

Rotation-constrained optical see-through headset calibration with bare-hand alignment https://paper.nweon.com/11288 Tue, 12 Oct 2021 03:01:19 +0000

PubDate: Aug 2021

Teams: Imperial College London;University of Pisa

Writers: Xue Hu, Ferdinando Rodriguez y Baena, Fabrizio Cutolo

PDF: Rotation-constrained optical see-through headset calibration with bare-hand alignment

Abstract

The inaccessibility of user-perceived reality remains an open issue in pursuing the accurate calibration of optical see-through (OST) head-mounted displays (HMDs). Manual user alignment is usually required to collect a set of virtual-to-real correspondences, so that a default or an offline display calibration can be updated to account for the user’s eye position(s). Current alignment-based calibration procedures usually require point-wise alignments between rendered image point(s) and associated physical landmark(s) of a target calibration tool. As each alignment can only provide one or a few correspondences, repeated alignments are required to ensure calibration quality.
This work presents an accurate and tool-less online OST calibration method to update an offline-calibrated eye-display model. The user’s bare hand is markerlessly tracked by a commercial RGBD camera anchored to the OST headset to generate a user-specific cursor for correspondence collection. The required alignment is object-wise, and can provide thousands of unordered corresponding points in tracked space. The collected correspondences are registered by a proposed rotation-constrained iterative closest point (rcICP) method to optimise the viewpoint-related calibration parameters. We implemented such a method for the Microsoft HoloLens 1. The resiliency of the proposed procedure to noisy data was evaluated through simulated tests and real experiments performed with an eye-replacement camera. According to the simulation test, the rcICP registration is robust against possible user-induced rotational misalignment. With a single alignment, our method achieves 8.81 arcmin (1.37 mm) positional error and 1.76 degree rotational error by camera-based tests in the arm-reach distance, and 10.79 arcmin (7.71 pixels) reprojection error by user tests.
