Software

Sound event detection

DCASE 2019 baseline: This is the baseline system for task 4 of the DCASE 2019 challenge. The algorithm performs sound event detection and classification. The system relies on convolutional and recurrent neural networks (CRNN) and a mean-teacher model to exploit a large amount of unbalanced and unlabeled training data, together with a small weakly annotated training set (without timestamps) and a strongly annotated synthetic set (with timestamps), to improve system performance. Authors: Nicolas Turpault, Romain Serizel
DCASE 2018 baseline: This is the baseline system for task 4 of the DCASE 2018 challenge. The algorithm performs sound event detection and classification using weakly labeled data (without timestamps). From an audio recording, the system must provide not only the event class but also the event time boundaries, even when multiple events are present in the recording. Authors: Nicolas Turpault, Romain Serizel
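To illustrate what providing "event time boundaries" involves, here is a minimal Python sketch that turns per-frame probabilities for a single event class into onset/offset pairs by simple thresholding. It is not the baseline's actual post-processing: the function name `decode_events`, the `threshold` value, and the `hop_seconds` frame step are all hypothetical choices made for illustration.

```python
def decode_events(frame_probs, threshold=0.5, hop_seconds=0.02):
    """Turn per-frame probabilities for one class into (onset, offset) pairs.

    frame_probs: sequence of floats in [0, 1], one per analysis frame.
    threshold:   assumed decision threshold (hypothetical default).
    hop_seconds: assumed time step between frames (hypothetical default).
    """
    events = []
    onset = None  # index of the first frame of the event currently open
    for i, p in enumerate(frame_probs):
        active = p >= threshold
        if active and onset is None:
            onset = i  # an event starts at this frame
        elif not active and onset is not None:
            events.append((onset * hop_seconds, i * hop_seconds))
            onset = None  # the event ended just before this frame
    if onset is not None:
        # The event is still active at the end of the recording.
        events.append((onset * hop_seconds, len(frame_probs) * hop_seconds))
    return events
```

Running the function once per class yields possibly overlapping events, which is how several events can be reported for the same recording.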

Beta NMF: Theano-based GPGPU implementation of NMF with beta-divergence and multiplicative updates.
Group NMF: Theano-based GPGPU implementation of group NMF with class and session similarity constraints. The NMF works with beta-divergence and multiplicative updates.
Mini-batch NMF: Theano-based GPGPU implementation of NMF with beta-divergence and mini-batch multiplicative updates.
Supervised (group) NMF: Python code to perform task-driven NMF and task-driven group NMF.
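For readers unfamiliar with the technique, the sketch below shows NMF with beta-divergence and the standard heuristic multiplicative updates. It is written in plain NumPy on the CPU rather than Theano, and it is not the code of the packages listed above: the function names, the random initialization, and the `eps` safeguard are assumptions made for illustration. The group constraints and mini-batch variants are not covered.

```python
import numpy as np

def beta_divergence(V, V_hat, beta):
    """Beta-divergence between V and V_hat (both strictly positive)."""
    if beta == 0:  # Itakura-Saito divergence
        ratio = V / V_hat
        return float(np.sum(ratio - np.log(ratio) - 1.0))
    if beta == 1:  # generalized Kullback-Leibler divergence
        return float(np.sum(V * np.log(V / V_hat) - V + V_hat))
    num = V**beta + (beta - 1) * V_hat**beta - beta * V * V_hat**(beta - 1)
    return float(np.sum(num) / (beta * (beta - 1)))

def beta_nmf(V, rank, beta=1.0, n_iter=100, eps=1e-12, seed=0):
    """Factorize a nonnegative matrix V (features x frames) as W @ H."""
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    # Random nonnegative initialization (an assumed, simple choice).
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Update the activations H with the dictionary W held fixed.
        V_hat = W @ H + eps
        H *= (W.T @ (V_hat ** (beta - 2) * V)) / (W.T @ V_hat ** (beta - 1) + eps)
        # Update the dictionary W with the activations H held fixed.
        V_hat = W @ H + eps
        W *= ((V_hat ** (beta - 2) * V) @ H.T) / (V_hat ** (beta - 1) @ H.T + eps)
    return W, H
```

Because each update multiplies the current factor by a nonnegative ratio, W and H stay nonnegative without any explicit projection step. Setting `beta=2` gives the Euclidean cost, `beta=1` the generalized Kullback-Leibler divergence, and `beta=0` the Itakura-Saito divergence.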

DCASE 2018 – TASK4: The goal of this dataset is to evaluate systems for the large-scale detection of sound events using weakly labeled data. The challenge is to explore whether a large amount of unbalanced and unlabeled training data can be exploited together with a small weakly annotated training set to improve system performance.
- Reference: Large-scale weakly labeled semi-supervised sound event detection in domestic environments, Romain Serizel, Nicolas Turpault, Hamid Eghbal-Zadeh, and Ankit Parag Shah, in Proc. DCASE2018, 2018.
DCASE 2019 – TASK4: The goal of this dataset is to evaluate systems for the large-scale detection of sound events using real data that is either weakly labeled or unlabeled, and simulated data that is strongly labeled (with timestamps). The scientific question this task investigates is whether real but partially and weakly annotated data is really needed, whether synthetic data alone is sufficient, or whether both are needed.
- Reference: Sound event detection in domestic environments with weakly labeled data and soundscape synthesis, Nicolas Turpault, Romain Serizel, Ankit Parag Shah, and Justin Salamon, working paper or preprint, 2019.