Roma G, Grais EM, Simpson AJR, Sobieraj I, Plumbley MD (2016) UNTWIST: A NEW TOOLBOX FOR AUDIO SOURCE SEPARATION,
Untwist is a new open-source toolbox for audio source separation. The library
provides a self-contained object-oriented framework including common source separation
algorithms as well as input/output functions, data management utilities and time-frequency
transforms. Everything is implemented in Python, facilitating research, experimentation and
prototyping across platforms. The code is available on GitHub.
In this paper, we propose two methods for polyphonic Acoustic Event Detection (AED) in real-life environments. The first method is based on Coupled Sparse Non-negative Matrix Factorization (CSNMF) of spectral representations and their corresponding class activity annotations. The second method is based on Multi-class Random Forest (MRF) classification of time-frequency patches. We compare the performance of the two methods on the recently published TUT Sound Events 2016 dataset, which contains data from home and residential area environments. On the development dataset, both methods perform comparably to the baseline system proposed for the DCASE 2016 Challenge; on the evaluation dataset, MRF outperforms the baseline.
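The coupling idea behind the first method can be illustrated with a minimal sketch. It assumes (this is not stated in the abstract) that the spectrogram and the class activity matrix are stacked and factorized with a shared activation matrix, using multiplicative updates for a Euclidean cost with an L1 penalty on the activations. The function names `sparse_nmf`, `train_coupled` and `predict_activity`, and all parameter values, are hypothetical and are not taken from the paper.

```python
import numpy as np

def sparse_nmf(V, rank, W=None, sparsity=0.1, n_iter=200, seed=0):
    """Euclidean NMF with an L1 penalty on H (multiplicative updates).

    If W is given, it is kept fixed and only H is estimated.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-9
    learn_W = W is None
    if learn_W:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # The sparsity term in the denominator shrinks the activations.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        if learn_W:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
            # Unit-norm columns stop W from growing to dodge the L1 penalty;
            # H is rescaled so that the product W @ H is unchanged.
            norms = np.linalg.norm(W, axis=0) + eps
            W /= norms
            H *= norms[:, None]
    return W, H

def train_coupled(spec, activity, rank=20):
    """Assumed coupling: factorize [spec; activity] with shared activations."""
    stacked = np.vstack([spec, activity])
    W, _ = sparse_nmf(stacked, rank)
    n_freq = spec.shape[0]
    return W[:n_freq], W[n_freq:]  # spectral part, annotation part

def predict_activity(spec, W_spec, W_act, threshold=0.5):
    """Estimate activations from audio alone, then map them to class activity."""
    _, H = sparse_nmf(spec, W_spec.shape[1], W=W_spec)
    return (W_act @ H) > threshold
```

Here `spec` is a non-negative (frequency x frames) matrix and `activity` is a binary (classes x frames) matrix. The threshold of 0.5 is an arbitrary choice for illustration and would need tuning on real data.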
Non-negative Matrix Factorization (NMF) is a well-established
tool for audio analysis. However, it is not well suited
for learning on weakly labeled data, i.e. data where the exact
timestamp of the sound of interest is not known. In this paper
we propose a novel extension to NMF that allows it to extract
meaningful representations from weakly labeled audio data.
Recently, a constraint on the activation matrix was proposed
to adapt NMF to learning on weak labels. To further improve the
method, we propose adding an orthogonality regularizer on the
dictionary to the cost function of NMF. In that way we obtain
appropriate dictionaries for the sounds of interest and background
sounds from weakly labeled data. We demonstrate
that the proposed Orthogonality-Regularized Masked NMF
(ORM-NMF) can be used for Audio Event Detection of rare
events and evaluate the method on the development data from
Task 2 of the DCASE 2017 Challenge.
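
As a rough illustration of the two ingredients named above, the following sketch combines a mask on the activation matrix with an orthogonality penalty on the dictionary. It rests on assumptions the abstract does not confirm: a Euclidean cost, a penalty of the form beta/2 * ||W_e^T W_b||_F^2 between the event part W_e and the background part W_b of the dictionary, and multiplicative updates. The function name `orm_nmf` and all parameters are hypothetical.

```python
import numpy as np

def orm_nmf(V, frame_has_event, n_event=4, n_background=4,
            beta=1.0, n_iter=200, seed=0):
    """Sketch of masked NMF with an orthogonality penalty (assumed form).

    V               : non-negative (frequency x frames) spectrogram
    frame_has_event : boolean array per frame, True if the frame belongs to
                      a recording weakly labeled as containing the event
    """
    rng = np.random.default_rng(seed)
    eps = 1e-9
    n_freq, n_frames = V.shape
    rank = n_event + n_background
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps

    # Mask: event components may not be active where the label says "no event".
    mask = np.ones((rank, n_frames))
    mask[:n_event, ~np.asarray(frame_has_event, dtype=bool)] = 0.0
    H *= mask  # multiplicative updates keep these zeros at zero

    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W_e, W_b = W[:, :n_event], W[:, n_event:]
        # Gradient of beta/2 * ||W_e^T W_b||_F^2 (up to beta) w.r.t. [W_e, W_b];
        # it is non-negative, so it goes into the denominator.
        penalty_grad = np.hstack([W_b @ (W_b.T @ W_e), W_e @ (W_e.T @ W_b)])
        W *= (V @ H.T) / (W @ H @ H.T + beta * penalty_grad + eps)
        # Unit-norm columns, with H rescaled so W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H
```

Under these assumptions, the first `n_event` columns of `W` act as the event dictionary and the remaining columns as the background dictionary. Larger values of `beta` push the two sets of atoms further apart.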