Listen to Look: Action Recognition by Previewing Audio

Editor: 映维 (Nweon) | Category: XR | May 14, 2020

Note: We are not able to review papers.

PubDate: Dec 2019

Teams: The University of Texas at Austin; Facebook AI Research

Authors: Ruohan Gao*, Tae-Hyun Oh, Kristen Grauman†, Lorenzo Torresani

PDF: Listen to Look: Action Recognition by Previewing Audio

Project: Listen to Look: Action Recognition by Previewing Audio


Abstract

In the face of the video data deluge, today’s expensive clip-level classifiers are increasingly impractical. We propose a framework for efficient action recognition in untrimmed video that uses audio as a preview mechanism to eliminate both short-term and long-term visual redundancies. First, we devise an ImgAud2Vid framework that hallucinates clip-level features by distilling from lighter modalities—a single frame and its accompanying audio—reducing short-term temporal redundancy for efficient clip-level recognition. Second, building on ImgAud2Vid, we further propose ImgAud-Skimming, an attention-based long short-term memory network that iteratively selects useful moments in untrimmed videos, reducing long-term temporal redundancy for efficient video-level recognition. Extensive experiments on four action recognition datasets demonstrate that our method achieves the state-of-the-art in terms of both recognition accuracy and speed.
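The attention-based selection of useful moments described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `attend` and `skim` are hypothetical, the per-moment feature vectors are made up, and the query vectors, which the abstract suggests would come from a long short-term memory network, are simply passed in as a fixed list here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Attention weights of `query` over `keys`, using dot-product scores."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)


def skim(keys, queries):
    """For each query (one per skimming step), return the index of the
    most-attended moment. Assumption: queries are given up front; in a
    recurrent model each query would depend on the previous steps."""
    picked = []
    for query in queries:
        weights = attend(query, keys)
        picked.append(max(range(len(keys)), key=lambda i: weights[i]))
    return picked


# Hypothetical cheap per-moment features (e.g. from one frame plus audio).
moment_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
step_queries = [[2.0, 0.0], [0.0, 3.0]]
selected = skim(moment_features, step_queries)
```

In this sketch only the selected moments would then be passed to a more expensive clip-level classifier, which is the source of the efficiency gain the abstract describes.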

Article link: https://paper.nweon.com/1024

