The DREGON dataset

Martin Strauss, Pol Mordel, Victor Miguet and I just released the DREGON dataset. DREGON stands for DRone EGonoise and localizatiON. It consists of sounds recorded with an 8-channel microphone array embedded in a quadrotor UAV (Unmanned Aerial Vehicle). The recordings are annotated with the precise 3D position of the sound source relative to the drone, as well as additional internal characteristics of the drone state such as motor speeds and inertial measurements. The dataset aims to promote research in UAV-embedded sound source localization, notably for semi-autonomous search-and-rescue with drones.

The VAST project

VAST stands for virtual acoustic space traveling and is a new paradigm for learning-based sound source localization and audio scene geometry estimation. Most existing methods that estimate the position of a sound source or other geometrical properties of an audio scene are based either on an approximate physical model (physics-driven) or on a special-purpose calibration set (data-driven). With VAST, the idea is to learn a mapping from audio features to the desired geometrical properties using a massive dataset of simulated room impulse responses. The dataset is designed to be maximally representative of the audio scenes the considered system may operate in, while remaining reasonably compact. The aim is to demonstrate that mappings learned on virtual datasets generalize well to real-world data, and to provide a useful tool for research teams interested in sound source localization.
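
To make the idea of "learning a mapping from simulated data" concrete, here is a minimal toy sketch in Python. It is not the VAST pipeline: a far-field two-microphone model stands in for a room impulse response simulator, a single interaural time difference stands in for the audio features, and a nearest-neighbour lookup stands in for the learned mapping. The microphone spacing, the noise level and all function names are assumptions chosen purely for illustration.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s
MIC_DISTANCE = 0.18     # m, hypothetical spacing of a two-microphone receiver


def simulate_itd(azimuth_deg):
    """Toy far-field model: interaural time difference (s) for a source azimuth."""
    return MIC_DISTANCE / SPEED_OF_SOUND * math.sin(math.radians(azimuth_deg))


def build_virtual_dataset(step_deg=1):
    """Simulated (feature, label) pairs covering the frontal half-plane."""
    return [(simulate_itd(az), az) for az in range(-90, 91, step_deg)]


def predict_azimuth(dataset, itd):
    """Nearest-neighbour mapping from the feature to an azimuth label (degrees)."""
    return min(dataset, key=lambda pair: abs(pair[0] - itd))[1]


virtual_dataset = build_virtual_dataset()

# Stand-in for "real-world" measurements: the same model plus measurement noise
# (assumed Gaussian, 5 microseconds standard deviation).
rng = random.Random(0)
test_azimuths = [-60, -30, 0, 30, 60]
errors = []
for az in test_azimuths:
    noisy_itd = simulate_itd(az) + rng.gauss(0.0, 5e-6)
    errors.append(abs(predict_azimuth(virtual_dataset, noisy_itd) - az))

mean_error_deg = sum(errors) / len(errors)
print(f"mean absolute azimuth error: {mean_error_deg:.1f} degrees")
```

In this toy setting the "virtual" and "real" data share the same underlying model, so generalization is easy. The interesting question VAST addresses is what happens when the real data come from actual rooms that the simulator only approximates.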

Clément Gaultier, Saurabh Kataria, Diego Di Carlo and I are working on the release of datasets for VAST. Two binaural datasets are already available on the project website. We co-authored two publications that use these datasets to demonstrate the paradigm for binaural 3D sound source localization and wall absorption estimation.