Deep Learning in Latent Space for Video Prediction and Compression
PubDate: June 2021
Teams: University of Michigan, Ann Arbor
Writers: Bowen Liu;Yu Chen;Shiyu Liu;Hun-Seok Kim
Learning-based video compression has made substantial progress in recent years. The most influential approaches use deep neural networks (DNNs) to remove spatial and temporal redundancies by finding appropriate lower-dimensional representations of the frames in a video.
We propose a novel DNN-based framework that predicts and compresses video sequences in the latent vector space. The proposed method first learns an efficient lower-dimensional latent space representation of each video frame and then performs inter-frame prediction in that latent domain. The latent domain compression of individual frames is obtained by a deep autoencoder trained with a generative adversarial network (GAN). To exploit the temporal correlation within the video frame sequence, we employ a convolutional long short-term memory (ConvLSTM) network to predict the latent vector representation of the future frame. We demonstrate our method with two applications, video compression and abnormal event detection, which share the same latent frame prediction network. The proposed method exhibits superior or competitive performance compared to state-of-the-art algorithms specifically designed for either video compression or anomaly detection.
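The data flow described above can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration and is not taken from the paper: `encode` and `decode` are toy pair-averaging stand-ins for the GAN-trained autoencoder, `predict_next` is a linear extrapolation standing in for the ConvLSTM, and coding a uniformly quantized prediction residual is one plausible way to turn latent prediction into compression. The sketch only shows how a single latent predictor could serve both applications: its residual is what gets stored for compression, and its prediction error acts as an anomaly score.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def encode(frame: Sequence[float]) -> Vector:
    """Toy encoder (assumption): average adjacent pixel pairs, halving the size."""
    return [(frame[i] + frame[i + 1]) / 2.0 for i in range(0, len(frame) - 1, 2)]


def decode(latent: Sequence[float]) -> Vector:
    """Toy decoder (assumption): repeat each latent value twice."""
    return [v for value in latent for v in (value, value)]


def predict_next(history: List[Vector]) -> Vector:
    """Stand-in for the ConvLSTM: linearly extrapolate the last two latents."""
    if len(history) < 2:
        return list(history[-1])
    return [2.0 * cur - prev for prev, cur in zip(history[-2], history[-1])]


def compress_sequence(
    frames: Sequence[Sequence[float]], step: float = 0.25
) -> Tuple[Vector, List[List[int]]]:
    """Return the first latent plus quantized prediction residuals per frame."""
    first = encode(frames[0])
    history = [first]
    symbols: List[List[int]] = []
    for frame in frames[1:]:
        latent = encode(frame)
        predicted = predict_next(history)
        quantized = [round((z - p) / step) for z, p in zip(latent, predicted)]
        symbols.append(quantized)
        # Keep the history identical to what the decoder will rebuild,
        # so that quantization error does not accumulate as drift.
        history.append([p + q * step for p, q in zip(predicted, quantized)])
    return first, symbols


def decompress_sequence(
    first: Sequence[float], symbols: List[List[int]], step: float = 0.25
) -> List[Vector]:
    """Rebuild latents from the residual symbols, then decode them to frames."""
    history = [list(first)]
    for quantized in symbols:
        predicted = predict_next(history)
        history.append([p + q * step for p, q in zip(predicted, quantized)])
    return [decode(latent) for latent in history]


def anomaly_score(history: List[Vector], frame: Sequence[float]) -> float:
    """Mean squared error between the predicted and the observed latent."""
    latent = encode(frame)
    predicted = predict_next(history)
    return sum((z - p) ** 2 for z, p in zip(latent, predicted)) / len(latent)


if __name__ == "__main__":
    # Four 4-pixel "frames" whose brightness increases linearly over time.
    frames = [[0.5 * t + i for i in range(4)] for t in range(4)]
    first, symbols = compress_sequence(frames)
    print("residual symbols:", symbols)
    print("reconstruction:", decompress_sequence(first, symbols))
    past = [encode(f) for f in frames[:3]]
    print("score, expected frame:", anomaly_score(past, frames[3]))
    print("score, abrupt frame:", anomaly_score(past, [9.0] * 4))
```

In this toy setting, frames that follow the predictable trend produce near-zero residuals, which are cheap to code, and low anomaly scores, while an abrupt change produces a large value for both. A learned predictor would replace `predict_next` without changing the surrounding logic.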