Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization

Editor: 映维 | Category: CV / XR | June 23, 2021

Note: We are not able to review papers.

PubDate: Mar 2021

Teams: Rutgers University, Google Research, Caltech, University of Delaware

Writers: Long Zhao, Yuxiao Wang, Jiaping Zhao, Liangzhe Yuan, Jennifer J. Sun, Florian Schroff, Hartwig Adam, Xi Peng, Dimitris Metaxas, Ting Liu

PDF: Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization

Abstract

We introduce a novel representation learning method to disentangle pose-dependent and view-dependent factors from 2D human poses. The method trains a network using cross-view mutual information maximization (CV-MIM), which maximizes the mutual information of the same pose performed from different viewpoints in a contrastive learning manner. We further propose two regularization terms to ensure disentanglement and smoothness of the learned representations. The resulting pose representations can be used for cross-view action recognition. To evaluate the power of the learned representations, in addition to the conventional fully-supervised action recognition settings, we introduce a novel task called single-shot cross-view action recognition. In this task, models are trained with actions from a single viewpoint and evaluated on poses captured from all possible viewpoints. We evaluate the learned representations on standard benchmarks for action recognition and show that (i) CV-MIM performs competitively compared with the state-of-the-art models in the fully-supervised scenarios; (ii) CV-MIM outperforms other competing methods by a large margin in the single-shot cross-view setting; and (iii) the learned representations can significantly boost performance when the amount of supervised training data is reduced. Our code is made publicly available at this https URL
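
To make the contrastive idea in the abstract concrete, here is a minimal sketch of an InfoNCE-style loss, a common contrastive estimator whose minimization raises a lower bound on mutual information. This is an illustration under assumptions, not the authors' objective: the abstract does not say which estimator CV-MIM uses, the two regularization terms are omitted, and the names `l2_normalize`, `info_nce_loss`, and `temperature` are hypothetical. Each row of `view_a` and the matching row of `view_b` stand for embeddings of the same pose seen from two viewpoints.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE loss over a batch of paired embeddings (illustrative only).

    view_a[i] and view_b[i] are assumed to embed the same pose from two
    viewpoints (the positive pair); every view_b[j] with j != i serves as
    a negative for anchor view_a[i].
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Temperature-scaled cosine similarity between the anchor and every candidate.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature for other in b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching view as the correct "class".
        total += log_norm - logits[i]
    return total / len(a)


# Toy batch: three 2D embeddings for view A.
poses_view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
# View B agrees with view A (same pose, same embedding) ...
aligned_view_b = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
# ... versus a mismatched pairing of the same vectors.
shuffled_view_b = [[0.0, 1.0], [-1.0, 0.0], [1.0, 0.0]]

loss_aligned = info_nce_loss(poses_view_a, aligned_view_b)
loss_shuffled = info_nce_loss(poses_view_a, shuffled_view_b)
```

In this toy batch the loss is lower when the two views of each pose agree than when the pairing is shuffled, which is the behavior a view-invariant pose encoder would be trained toward. With a batch of N pairs, log(N) minus this loss is the usual InfoNCE lower bound on the mutual information between the two views.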

Article link: https://paper.nweon.com/10302
