Key Algorithm for Human Motion Recognition in Virtual Reality Video Sequences Based on Hidden Markov Model
PubDate: August 2020
Teams: Jiamusi University
Writers: Lei Liu; Yufeng Jiao; Fanwei Meng
Abstract
This paper provides an in-depth discussion of human motion recognition in Virtual Reality (VR) video sequences using hidden Markov models, proceeding in four steps: VR video acquisition and pre-processing, foreground detection, extraction of human feature parameters, and hidden-Markov-model motion recognition. A Gaussian mixture model was used to build a background model in real time from changes in the VR video information, and the background was subtracted from each image by the background-difference method. The optical flow method was used for foreground detection of the target, and the effects of sparse and dense optical flow were compared to obtain the motion characteristics and optical flow information of the target human body. Features were then extracted for human motion from two sources: common geometric features of the body and the optical flow information. For the geometric information, the width-to-height ratio, perimeter-to-area ratio, center of mass, eccentricity, and feature angle were extracted; for the optical flow information, optical flow descriptors were constructed using a grid-based approach. The two sets of parameters were fused by the k-means method to construct a bag-of-words model. Hidden Markov models were then used for motion recognition: the model parameters were obtained by training on the human feature parameters of each of four motions, and the four common human body movements were recognized with the forward-backward algorithm. The test results show that the proposed motion recognition method has high recognition accuracy and good anti-interference performance.
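The geometric feature step above can be sketched from a binary silhouette mask. This is an illustrative reconstruction, not the paper's code: it computes three of the five listed features (width-to-height ratio, perimeter-to-area ratio, center of mass); eccentricity and feature angle are omitted for brevity, and the crude 4-neighbour perimeter estimate is an assumption.

```python
import numpy as np

def silhouette_features(mask):
    """Geometric features of a binary human silhouette (0/1 array).

    Returns (width/height, perimeter/area, centroid) -- an
    illustrative subset of the features listed in the abstract.
    """
    ys, xs = np.nonzero(mask)
    height = ys.max() - ys.min() + 1
    width = xs.max() - xs.min() + 1
    area = float(mask.sum())
    # Crude perimeter estimate: foreground pixels that have at least
    # one 4-connected background neighbour.
    padded = np.pad(mask, 1)
    core = (padded[1:-1, 1:-1] & padded[:-2, 1:-1] & padded[2:, 1:-1]
            & padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = float(mask.sum() - core.sum())
    centroid = (xs.mean(), ys.mean())  # center of mass (x, y)
    return width / height, perimeter / area, centroid
```

In practice the mask would come from the background-difference step; here it is simply passed in.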
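The grid-based optical-flow descriptor could look like the following minimal sketch. The grid size and mean pooling are assumptions; the paper only states that descriptors are built on a grid over the flow field.

```python
import numpy as np

def grid_flow_descriptor(flow, grid=(4, 4)):
    """Grid-based optical-flow descriptor (illustrative sketch).

    `flow` is an (H, W, 2) array of per-pixel (dx, dy) vectors, e.g.
    from a dense optical-flow step. The frame is split into grid
    cells and the mean flow of each cell is concatenated into one
    descriptor vector.
    """
    h, w, _ = flow.shape
    gy, gx = grid
    cells = []
    for i in range(gy):
        for j in range(gx):
            cell = flow[i * h // gy:(i + 1) * h // gy,
                        j * w // gx:(j + 1) * w // gx]
            cells.append(cell.reshape(-1, 2).mean(axis=0))
    return np.concatenate(cells)  # length gy * gx * 2
```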
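The k-means fusion into a bag-of-words model amounts to quantizing each frame's fused feature vector against learned cluster centers. A minimal sketch, assuming the centroids have already been obtained by running k-means on training features (the training itself is omitted):

```python
import numpy as np

def bow_histogram(features, centroids):
    """Quantize per-frame feature vectors against k-means centroids
    and build a bag-of-words histogram (illustrative sketch).

    `features`: (T, D) fused feature vectors, one per frame.
    `centroids`: (K, D) k-means cluster centers ("words").
    Returns the normalized histogram and the per-frame word indices.
    """
    # Assign each feature to its nearest centroid (Euclidean distance).
    d = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centroids)).astype(float)
    return hist / hist.sum(), words
```

The per-frame word indices are what a discrete-observation hidden Markov model would consume as its observation sequence.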
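Recognition with one trained HMM per motion can be sketched with the forward algorithm: score the observation sequence under each action's model and pick the most likely. This is a standard scaled forward recursion, not the paper's implementation; model parameters here are stand-ins for what training would produce.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the forward algorithm (scaled to avoid underflow).

    pi: (N,) initial state probabilities; A: (N, N) transitions;
    B: (N, M) emission probabilities over the bag-of-words symbols.
    """
    alpha = pi * B[:, obs[0]]
    loglik = 0.0
    for o in obs[1:]:
        c = alpha.sum()          # scaling factor for this step
        loglik += np.log(c)
        alpha = (alpha / c) @ A * B[:, o]
    return loglik + np.log(alpha.sum())

def classify(obs, models):
    """Pick the action whose HMM gives the highest log-likelihood.

    `models` maps action name -> (pi, A, B) for that action's HMM.
    """
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))
```

Training the per-action parameters (e.g. with Baum-Welch, the forward-backward procedure the abstract mentions) is left out; only the scoring side is shown.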