LSTM-Based Viewpoint Prediction for Multi-Quality Tiled Video Coding in Virtual Reality Streaming
PubDate: September 2020
Teams: École de technologie supérieure; Summit Tech Multimedia
Writers: Mohammadreza Jamali; Stéphane Coulombe; Ahmad Vakili; Carlos Vazquez
Virtual reality (VR) streaming is impaired by the large amount of data required to deliver 360-degree video, resulting in a low-quality end-user experience when network bandwidth is limited or latency is high. To address these challenges, this paper proposes a novel method for long-term viewpoint prediction in VR streaming. The method uses a long short-term memory (LSTM) encoder-decoder network to perform sequence-to-sequence prediction. To enhance the network's results, experiments are performed using viewpoint information from users on low-latency networks. By applying an effective tile-based quality assignment after viewpoint prediction, the method achieves a 61% average bandwidth reduction, with respect to transmitting the whole equirectangular projection (ERP) frame, while rendering a high-quality viewport to the end user.
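The tile-based quality assignment described above can be sketched as follows: the ERP frame is split into a grid of tiles, tiles overlapping the predicted viewport are streamed at high quality, and the rest at low quality. This is a minimal illustrative sketch, not the paper's implementation; the 8x4 tiling, 90-degree field of view, and function names are assumptions.

```python
def viewport_tiles(yaw_deg, pitch_deg, cols=8, rows=4, fov_h=90.0, fov_v=90.0):
    """Return the (row, col) indices of ERP tiles overlapping the predicted viewport.

    Assumes an 8x4 tile grid and a 90x90-degree field of view (illustrative values,
    not taken from the paper).
    """
    tiles = set()
    tile_w = 360.0 / cols  # horizontal degrees covered by one tile
    tile_h = 180.0 / rows  # vertical degrees covered by one tile
    for r in range(rows):
        for c in range(cols):
            # Tile centre in degrees: yaw in [-180, 180), pitch in [-90, 90)
            cy = -180.0 + (c + 0.5) * tile_w
            cp = -90.0 + (r + 0.5) * tile_h
            # Yaw distance with wrap-around at +/-180 degrees
            dy = abs((cy - yaw_deg + 180.0) % 360.0 - 180.0)
            dp = abs(cp - pitch_deg)
            if dy <= (fov_h + tile_w) / 2 and dp <= (fov_v + tile_h) / 2:
                tiles.add((r, c))
    return tiles


def assign_qualities(yaw_deg, pitch_deg, cols=8, rows=4):
    """Assign 'high' quality to tiles in the predicted viewport, 'low' elsewhere."""
    vp = viewport_tiles(yaw_deg, pitch_deg, cols, rows)
    return {(r, c): ("high" if (r, c) in vp else "low")
            for r in range(rows) for c in range(cols)}
```

In a streaming session, the LSTM-predicted viewpoint for a future segment would drive `assign_qualities`, and the client would request the corresponding tile representations; the bandwidth saving comes from the low-quality tiles outside the viewport.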