More than 400 million people worldwide suffer from disabling hearing loss. Untreated hearing loss causes social isolation, which in turn increases the risk of depression and dementia, and it carries tremendous societal costs. Even so, only about half of those who could benefit from a hearing aid actually use one. One of the most sought-after improvements to hearing aids is better intelligibility of speech in noise.
Speech separation is the process of separating speech from interfering sound, and the performance of speech separation algorithms has advanced with the introduction of deep learning methods. Such systems need to learn, and they learn by being trained on data. Current deep learning systems, however, perform poorly in scenarios with speakers and noises that the system did not encounter during training. Modelling uncertainty through probabilities can help models learn from less data and generalize better.
The project explores how probabilistic modelling can enable deep learning models to perform well in unseen scenarios. Probabilistic methods, such as variational Bayesian inference, are leveraged to learn latent representations of audio, which are, in a sense, distillations of the audio signal. The project will investigate imposing useful characteristics and structure on the learnt representations, which in turn serve as the basis for a speech separation system.
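The variational idea above can be sketched minimally: an encoder maps an audio frame to a Gaussian posterior over a compact latent code, a sample is drawn via the reparameterization trick, and a KL term regularizes the code toward a standard-normal prior. This is a toy illustration, not the project's actual system; the frame and latent sizes are arbitrary, and random linear maps stand in for a trained encoder network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a short audio frame compressed to a small latent code.
FRAME_DIM, LATENT_DIM = 64, 8

# Random linear weights stand in for a trained encoder network.
W_mu = rng.normal(scale=0.1, size=(LATENT_DIM, FRAME_DIM))
W_logvar = rng.normal(scale=0.1, size=(LATENT_DIM, FRAME_DIM))

def encode(frame):
    """Map an audio frame to the mean and log-variance of a Gaussian q(z|x)."""
    return W_mu @ frame, W_logvar @ frame

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps, so the sample stays differentiable in mu, sigma."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_to_standard_normal(mu, logvar):
    """KL(q(z|x) || N(0, I)): the regularizer in the variational objective."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

frame = rng.standard_normal(FRAME_DIM)   # a stand-in for one frame of audio
mu, logvar = encode(frame)
z = reparameterize(mu, logvar)           # the latent "distillation" of the frame
print(z.shape)                           # (8,)
print(kl_to_standard_normal(mu, logvar) >= 0.0)  # True: the KL term is non-negative
```

In a full system, the KL term would be combined with a reconstruction (or separation) loss, and structure would be imposed on `z`, for example by choosing a structured prior, so that speech and interference occupy separable parts of the latent space.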