Voice separation with an unknown number of multiple speakers
PubDate: July 10, 2020
Teams: Facebook AI;Tel-Aviv University
Writers: Eliya Nachmani;Yossi Adi;Lior Wolf
PDF: Voice separation with an unknown number of multiple speakers
Abstract
We present a new method for separating a mixed audio sequence, in which multiple voices speak
simultaneously. The new method employs gated neural networks that are trained to separate the
voices at multiple processing steps, while maintaining the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest
number of speakers is employed to select the actual number of speakers in a given sample. Our
method greatly outperforms the current state of the art, which, as we show, is not competitive for
more than two speakers.