Researchers at Google claim to have developed a machine learning model that can separate a sound source from noisy, single-channel audio based on only a short sample of the target source. In a paper, they say their SoundFilter system can be tuned to filter arbitrary sound sources, even those it hasn’t seen during training.
The researchers believe a noise-eliminating system like SoundFilter could be used to create a range of useful technologies. For instance, Google drew on audio from thousands of its own meetings and YouTube videos to train the noise-canceling algorithm in Google Meet. Meanwhile, a team of Carnegie Mellon researchers created a “sound-action-vision” corpus to anticipate where objects will move when subjected to physical force.
SoundFilter treats sound separation as a one-shot learning problem. The model receives two inputs: the audio mixture to be filtered and a single short example of the kind of sound to extract. Once trained, SoundFilter is expected to pull that kind of sound out of the mixture if it is present.
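To make that input/output contract concrete, here is a minimal Python sketch. It is not the SoundFilter model, which is a learned system described in the paper. A hand-written frequency mask stands in for it, purely to show the shape of the task: a mixture and a short reference clip go in, and an estimate of the target sound comes out. The names `filter_by_example`, `tone`, `dft`, and `idft` are hypothetical, and the signals are synthetic sine tones chosen so the toy mask works cleanly.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (O(n^2)); fine for short toy signals."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def tone(cycles_per_sample, length):
    """Synthetic sine tone; frequency is given in cycles per sample."""
    return [math.sin(2 * math.pi * cycles_per_sample * t) for t in range(length)]


def filter_by_example(mixture, reference, threshold=0.5):
    """Toy stand-in for a learned separator (an assumption, not SoundFilter).

    Keeps only the mixture's frequency bins where the short reference clip
    has strong energy, and zeroes the rest.
    """
    ref_mag = [abs(c) for c in dft(reference)]
    peak = max(ref_mag)
    n, m = len(mixture), len(reference)
    spectrum = dft(mixture)
    # Map each mixture bin k to the reference bin covering the same frequency.
    kept = [
        value if ref_mag[k * m // n] >= threshold * peak else 0
        for k, value in enumerate(spectrum)
    ]
    return idft(kept)


# Single-channel mixture: the target tone plus an interfering tone.
target = tone(1 / 32, 256)
interferer = tone(5 / 32, 256)
mixture = [a + b for a, b in zip(target, interferer)]

# A single short example of the target sound (a quarter of the mixture length).
reference = tone(1 / 32, 64)

output = filter_by_example(mixture, reference)
residual = max(abs(o - t) for o, t in zip(output, target))
print("largest deviation from the clean target:", residual)
```

The fixed mask only works here because the two tones occupy separate frequency bins. Real recordings overlap in time and frequency, which is why a trained model is needed for the task the researchers describe.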