H4H: Hybrid Convolution-Transformer Architecture Search for NPU-CIM Heterogeneous Systems for AR/VR Applications
PubDate: May 2025
Affiliations: 1Carnegie Mellon University, 2Meta Reality Labs, 3New York University, 4Cornell Tech
Authors: Yiwei Zhao1, Jinhui Chen2, Sai Qian Zhang3, Syed Shakib Sarwar2, Kleber Hugo Stangherlin2, Jorge Tomas Gomez2, Jae-Sun Seo4, Barbara De Salvo2, Chiao Liu2, Phillip B. Gibbons1, Ziyun Li
Abstract
Low-latency and low-power edge AI is crucial for Augmented/Virtual Reality applications. Recent advances demonstrate that hybrid machine learning (ML) models, which combine convolutional neural networks (CNNs) and vision transformers (ViTs), often achieve a superior accuracy/performance tradeoff. However, hybrid ML models pose system-level challenges for latency and energy efficiency because their dataflows and memory access patterns differ substantially across the CNN and ViT components.
In this work, we leverage the architectural heterogeneity of Neural Processing Units (NPUs) and Compute-In-Memory (CIM) units and explore diverse execution schemes for the efficient execution of hybrid models. We introduce H4H-NAS, a two-stage Neural Architecture Search (NAS) framework that automates the design of hybrid CNN/ViT models for heterogeneous edge systems featuring both NPU and CIM. Our NAS employs a two-phase incremental supernet training scheme to improve model accuracy, and is guided by a performance estimator built from NPU measurements on real silicon and CIM performance models based on industry IPs. H4H-NAS achieves up to 1.34% top-1 accuracy improvement on ImageNet-1k classification, along with up to 56.08% overall latency and 41.72% energy reductions.
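To make the hardware-aware search objective concrete, here is a minimal sketch of how a NAS framework can trade accuracy against estimated latency and energy from a performance estimator. All function names, cost coefficients, candidate architectures, and weights below are invented for illustration and do not reflect H4H-NAS's actual estimator or search algorithm.

```python
# Hypothetical sketch: scoring hybrid CNN/ViT candidates with a
# hardware-aware objective, as in performance-estimator-guided NAS.

def estimate_latency_ms(conv_blocks, attn_blocks):
    # Toy performance model: assumes convolutions and attention have
    # different per-block costs on the NPU/CIM system (made-up numbers).
    return 0.8 * conv_blocks + 1.5 * attn_blocks

def estimate_energy_mj(conv_blocks, attn_blocks):
    # Toy energy model with invented per-block costs.
    return 0.5 * conv_blocks + 1.1 * attn_blocks

def score(candidate, lat_weight=0.05, eng_weight=0.02):
    """Higher is better: accuracy minus weighted hardware costs."""
    lat = estimate_latency_ms(candidate["conv"], candidate["attn"])
    eng = estimate_energy_mj(candidate["conv"], candidate["attn"])
    return candidate["acc"] - lat_weight * lat - eng_weight * eng

def search(candidates):
    # The search stage reduced to its essence: pick the architecture
    # that maximizes the hardware-aware objective.
    return max(candidates, key=score)

candidates = [
    {"name": "conv-heavy", "conv": 12, "attn": 2, "acc": 0.78},
    {"name": "balanced",   "conv": 8,  "attn": 4, "acc": 0.80},
    {"name": "attn-heavy", "conv": 4,  "attn": 8, "acc": 0.81},
]
print(search(candidates)["name"])  # → balanced
```

Under this toy objective, the attention-heavy model's higher accuracy is outweighed by its hardware cost, so the balanced hybrid wins, which is the kind of tradeoff a hardware-aware NAS is designed to surface.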