Designing Parameter and Compute Efficient Diffusion Transformers using Distillation

Editor: 广东客 | Category: CV | June 5, 2025

Note: We are not able to review papers.

PubDate: Feb 2025

Teams: University of Illinois Urbana-Champaign

Writers: Vignesh Sundaresha

PDF: Designing Parameter and Compute Efficient Diffusion Transformers using Distillation

Abstract

Diffusion Transformers (DiTs) with billions of model parameters form the backbone of popular image and video generation models like DALL-E, Stable Diffusion and Sora. Though these models are necessary in many low-latency applications like Augmented/Virtual Reality, they cannot be deployed on resource-constrained Edge devices (like Apple Vision Pro or Meta Ray-Ban glasses) due to their huge computational complexity. To overcome this, we turn to knowledge distillation and perform a thorough design-space exploration to achieve the best DiT for a given parameter size. In particular, we provide principles for how to choose design knobs such as depth, width, attention heads and distillation setup for a DiT. During the process, a three-way trade-off emerges between model performance, size and speed that is crucial for Edge implementation of diffusion. We also propose two distillation approaches, the Teaching Assistant (TA) method and the Multi-In-One (MI1) method, to perform feature distillation in the DiT context. Unlike existing solutions, we demonstrate and benchmark the efficacy of our approaches on practical Edge devices such as NVIDIA Jetson Orin Nano.
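
For readers unfamiliar with the term, feature distillation in general trains a small student network to match intermediate features of a larger teacher. The abstract does not spell out the TA or MI1 methods, so the sketch below is not the authors' method. It shows only a generic feature-matching loss, assuming a hypothetical linear projection (`weight`) that maps student features of width d_s to the teacher width d_t, with mean squared error as the distance. All function and variable names are illustrative.

```python
from typing import List

Matrix = List[List[float]]


def project(features: Matrix, weight: Matrix) -> Matrix:
    """Map each token feature of width d_s to width d_t.

    `weight` is a hypothetical d_s x d_t projection matrix (an assumption
    of this sketch, needed because student and teacher widths differ).
    """
    d_t = len(weight[0])
    return [
        [sum(tok[i] * weight[i][j] for i in range(len(tok))) for j in range(d_t)]
        for tok in features
    ]


def feature_distillation_loss(student: Matrix, teacher: Matrix, weight: Matrix) -> float:
    """Mean squared error between projected student features and teacher features."""
    projected = project(student, weight)
    total = 0.0
    count = 0
    for p_tok, t_tok in zip(projected, teacher):
        for p, t in zip(p_tok, t_tok):
            total += (p - t) ** 2
            count += 1
    return total / count


# One token: student width 2, teacher width 3.
student_feats = [[1.0, 2.0]]
proj_weight = [[1.0, 0.0, 1.0],
               [0.0, 1.0, 1.0]]
teacher_feats = [[1.0, 2.0, 3.0]]
loss = feature_distillation_loss(student_feats, teacher_feats, proj_weight)
```

In a real training loop the projection would be learned jointly with the student, and this term would typically be added to the usual diffusion training loss with a weighting factor. Both choices are assumptions here, not details taken from the paper.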

Article link: https://paper.nweon.com/16353
