Speech separation

Situation with a single speaker and diffuse crowd noise

Initial clean speech:

Mixture with reverberation and diffuse noise at 0 dB SNR:

  • 3D spatial scene made with binaural rendering (listen with headphones):
  • Mono signal:
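
As a rough illustration of what "0 dB SNR" means for the mixture above, the sketch below scales a noise signal so that the speech-to-noise power ratio reaches a requested value in decibels. It is not the procedure used to build these examples: the helper names `power` and `mix_at_ratio` are hypothetical, and measuring the ratio on the full-length signals (rather than on the reverberant speech or on active speech segments only) is an assumption.

```python
import math

def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)

def mix_at_ratio(target, interference, ratio_db):
    """Scale `interference` so that power(target) / power(scaled interference)
    equals `ratio_db` decibels, then add it to `target`.
    Returns the mixture and the gain applied to the interference."""
    linear_ratio = 10 ** (ratio_db / 10)
    gain = math.sqrt(power(target) / (power(interference) * linear_ratio))
    mixture = [t + gain * n for t, n in zip(target, interference)]
    return mixture, gain

# Toy signals (assumed values, for illustration only).
speech = [1.0, -1.0, 1.0, -1.0]   # power 1.0
noise = [0.5, 0.5, -0.5, -0.5]    # power 0.25

mixture_0db, gain_0db = mix_at_ratio(speech, noise, 0.0)
mixture_20db, gain_20db = mix_at_ratio(speech, noise, 20.0)
```

At 0 dB the scaled noise carries as much power as the speech; at 20 dB it carries one hundredth of it. The same scaling applies to a signal-to-interference ratio (SIR) when the interference is a competing speaker.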

Speech enhanced with a simple directional filter (delay-and-sum-type beamformer):

Speech enhanced with our LSTM-based system:

Situation with two speakers (25° apart) and diffuse crowd noise

Initial clean speech:

Mixture with reverberation, a competing speaker at 0 dB SIR and diffuse noise at 20 dB SNR:

  • 3D spatial scene made with binaural rendering (listen with headphones):
  • Mono signal:

Speech enhanced with a simple directional filter (delay-and-sum-type beamformer):

Speech enhanced with our LSTM-based system:
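
For reference, the delay-and-sum baseline named above can be sketched in its simplest time-domain form: each microphone channel is advanced by the delay with which the target reaches it, and the aligned channels are averaged. This is only a minimal sketch, not the directional filter used for these examples; the function name `delay_and_sum`, the restriction to integer sample delays, and the assumption that the delays are already known are all illustrative choices.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average the result.

    channels: list of equal-length sample lists, one per microphone.
    delays:   per-channel arrival delay of the target, in samples
              (non-negative integers, assumed known).
    """
    n = len(channels[0])
    out = [0.0] * n
    for channel, delay in zip(channels, delays):
        # Advance the channel by `delay` samples; samples shifted past the
        # end of the buffer are dropped (the tail is treated as zero).
        for i in range(n - delay):
            out[i] += channel[i + delay]
    return [value / len(channels) for value in out]

# Toy example (assumed values): the same pulse reaches the second
# microphone two samples later than the first.
source = [0, 1, 0, -1, 0, 0]
mic_1 = [0, 1, 0, -1, 0, 0]
mic_2 = [0, 0, 0, 1, 0, -1]

enhanced = delay_and_sum([mic_1, mic_2], [0, 2])
```

Signals arriving from the steered direction add coherently after alignment, while sound from other directions and diffuse noise add with mismatched delays and are attenuated on average.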