Optimizing Network Structure for 3D Human Pose Estimation
Teams: Microsoft
Writers: Hai Ci, Chunyu Wang, Xiaoxuan Ma, Yizhou Wang
Publication date: October 2019
Abstract
A human pose is essentially a skeletal graph in which the joints are the nodes and the bones linking them are the edges, so it is natural to apply a Graph Convolutional Network (GCN) to estimate 3D poses from 2D poses. In this work, we factor the Laplacian operator in GCN into the product of a structure matrix and a weight matrix. Based on this formulation, we show that GCN has limited representation ability when used to estimate 3D poses. We overcome this limitation by introducing the Locally Connected Network (LCN), which constructs the two matrices based on human anatomy and notably improves representation ability over GCN. In addition, because every joint is connected only to a small number of joints in its neighborhood, LCN has strong generalization ability. Experiments on public datasets show that LCN (1) outperforms the state of the art by a notable margin; (2) is less data hungry than alternative models; and (3) generalizes well to unseen actions, unseen datasets, and even noisy 2D poses.
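
The contrast between a shared-weight graph convolution and a locally connected layer can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 5-joint skeleton, the row normalization, the function names, and the per-pair weight layout are all assumptions made for illustration, since the abstract does not give the exact formulation. The sketch only shows the general idea that a binary structure matrix decides which joints interact, while the weights are either shared by all joints (GCN-style) or specific to each joint pair (LCN-style).

```python
import numpy as np

# Hypothetical toy skeleton (not the paper's joint set):
# 0 pelvis - 1 spine - 2 neck - 3 head, plus 2 neck - 4 shoulder.
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4)]
NUM_JOINTS = 5


def structure_matrix(num_joints, edges):
    """Binary matrix S with S[i, j] = 1 when j is joint i itself or a neighbor."""
    s = np.eye(num_joints)
    for i, j in edges:
        s[i, j] = s[j, i] = 1.0
    return s


def gcn_layer(x, s, w):
    """GCN-style layer: one weight matrix w shared by every joint.

    x: (J, D_in) joint features, s: (J, J) structure, w: (D_in, D_out).
    Row normalization is an assumption standing in for the Laplacian.
    """
    s_norm = s / s.sum(axis=1, keepdims=True)
    return s_norm @ x @ w


def lcn_layer(x, s, w):
    """Locally connected layer: a separate weight matrix per joint pair.

    w: (J, J, D_in, D_out); entries for unconnected pairs are masked by s.
    out[i] = sum_j s[i, j] * (x[j] @ w[i, j])
    """
    return np.einsum("ij,jd,ijde->ie", s, x, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = structure_matrix(NUM_JOINTS, EDGES)
    x = rng.normal(size=(NUM_JOINTS, 2))                 # 2D joint positions
    w_shared = rng.normal(size=(2, 3))                   # GCN-style weights
    w_local = rng.normal(size=(NUM_JOINTS, NUM_JOINTS, 2, 3))  # LCN-style weights
    print(gcn_layer(x, s, w_shared).shape)
    print(lcn_layer(x, s, w_local).shape)
```

In this sketch, tying every `w[i, j]` to the same matrix scaled by the inverse neighborhood size of joint `i` reduces `lcn_layer` to `gcn_layer`, which is one way to see the shared-weight layer as a restricted special case of the locally connected one.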