A Novel Video Salient Object Detection Method via Semisupervised Motion Quality Perception
PubDate: July 2021
Teams: Qingdao University; Jiangxi University of Finance and Economics
Writers: Chenglizhao Chen; Jia Song; Chong Peng; Guodong Wang; Yuming Fang
Previous video salient object detection (VSOD) approaches have mainly pursued performance improvements through network design. However, with the recent slowdown in the development of deep learning techniques, it is becoming increasingly difficult to expect another breakthrough from complex networks alone. This paper therefore proposes a universal learning scheme that yields a further 3% performance improvement for all state-of-the-art (SOTA) VSOD models. The major highlight of our method is ‘motion quality’, a new concept for mining frames from the ‘buffered’ testing video stream to construct a fine-tuning set. Every frame in this set is one whose salient object is already well detected by the ‘target SOTA model’ — the model we want to improve. The VSOD results on the mined set, previously produced by the target SOTA model itself, can thus be applied directly as pseudolearning objectives to fine-tune a completely new spatial model pretrained on the widely used DAVIS-TR set. Because some spatial scenes in the buffered testing video stream have already been seen during fine-tuning, the fine-tuned spatial model performs very well on the remaining unseen testing frames, significantly outperforming the target SOTA model. Although offline fine-tuning incurs additional time costs, the performance gain still benefits scenarios without speed requirements. Moreover, its semisupervised methodology may have considerable potential to inspire the VSOD community in the future.
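The mining step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function and parameter names (`select_finetune_set`, `motion_quality_scores`, `top_ratio`) are hypothetical, and the actual motion-quality measure is defined in the paper itself. The sketch only shows the idea of keeping the frames whose detections are judged most trustworthy and pairing them with the target model's own predictions as pseudo-labels.

```python
import numpy as np

def select_finetune_set(frames, saliency_maps, motion_quality_scores, top_ratio=0.3):
    """Mine 'buffered' testing frames whose saliency predictions are deemed
    reliable (high motion quality) and pair each with the target SOTA
    model's own prediction, to be used as a pseudo-label for fine-tuning.

    frames                -- list of video frames
    saliency_maps         -- target model's predictions, one per frame
    motion_quality_scores -- per-frame quality scores (stand-in for the
                             paper's motion-quality measure)
    top_ratio             -- fraction of highest-quality frames to keep
    """
    n_select = max(1, int(len(frames) * top_ratio))
    order = np.argsort(motion_quality_scores)[::-1]  # highest quality first
    chosen = order[:n_select]
    # Pseudo-learning objectives: the mined frames with the target model's
    # own saliency maps as training targets for the new spatial model.
    return [(frames[i], saliency_maps[i]) for i in chosen]
```

The returned (frame, pseudo-label) pairs would then drive a standard supervised fine-tuning loop on the pretrained spatial model.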