
Iwona Sobieraj
Postgraduate Research Student
Academic and research departments
Centre for Vision, Speech and Signal Processing (CVSSP), Department of Electrical and Electronic Engineering.
Publications
Roma G, Grais EM, Simpson AJR, Sobieraj I, Plumbley MD (2016) UNTWIST: A NEW TOOLBOX FOR AUDIO SOURCE SEPARATION
Untwist is a new open source toolbox for audio source separation. The library
provides a self-contained object-oriented framework including common source separation
algorithms as well as input/output functions, data management utilities and time-frequency
transforms. Everything is implemented in Python, facilitating research, experimentation and
prototyping across platforms. The code is available on GitHub.
Sobieraj I, Plumbley MD (2016) Coupled Sparse NMF vs. Random Forest Classification for Real Life Acoustic Event Detection, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2016 Workshop (DCASE2016) pp. 90-94
In this paper, we propose two methods for polyphonic Acoustic Event Detection (AED) in real-life environments. The first method is based on Coupled Sparse Non-negative Matrix Factorization (CSNMF) of spectral representations and their corresponding class activity annotations. The second method is based on Multi-class Random Forest (MRF) classification of time-frequency patches. We compare the performance of the two methods on the recently published TUT Sound Events 2016 dataset, which contains data from home and residential area environments. Both methods perform comparably to the baseline system proposed for the DCASE 2016 Challenge on the development dataset, with MRF outperforming the baseline on the evaluation dataset.
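One common way to realise a coupled factorization of this kind is to stack the spectrogram and the class activity matrix, factorize them with a shared activation matrix, and at test time recover activations from the spectral part alone. The following is a minimal NumPy sketch under that assumption, not the authors' implementation: the function names, the Euclidean cost, the L1 sparsity weight and the 0.5 decision threshold are all illustrative choices.

```python
import numpy as np


def train_coupled_nmf(V, L, n_atoms=20, sparsity=0.1, n_iter=200, eps=1e-9, seed=0):
    """Jointly factorize a spectrogram V (F x T) and label activity L (C x T).

    Hypothetical helper: stacks V and L so that both share one activation
    matrix H, then runs Euclidean multiplicative updates with an L1 penalty
    on H. Atoms are not renormalized here; a fuller implementation would
    usually do so.
    """
    rng = np.random.default_rng(seed)
    X = np.vstack([V, L])                      # coupled data matrix (F + C) x T
    W = rng.random((X.shape[0], n_atoms)) + eps
    H = rng.random((n_atoms, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + sparsity + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    n_freq = V.shape[0]
    return W[:n_freq], W[n_freq:]              # spectral part, label part


def predict_activity(V_test, W_spec, W_label, sparsity=0.1, n_iter=100,
                     threshold=0.5, eps=1e-9, seed=0):
    """Estimate activations from the spectral dictionary, then map to classes."""
    rng = np.random.default_rng(seed)
    H = rng.random((W_spec.shape[1], V_test.shape[1])) + eps
    for _ in range(n_iter):                    # dictionary is kept fixed
        H *= (W_spec.T @ V_test) / (W_spec.T @ W_spec @ H + sparsity + eps)
    return (W_label @ H) > threshold           # boolean (C x T) activity matrix
```

Because the label dictionary is learned together with the spectral dictionary, each atom carries both a spectral shape and a class profile, which is what lets overlapping (polyphonic) events be read off the same activations.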
Sobieraj I, Rencker L, Plumbley MD (2018) Orthogonality-regularized masked NMF for learning on weakly labeled audio data, Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) pp. 2436-2440, Institute of Electrical and Electronics Engineers (IEEE)
Non-negative Matrix Factorization (NMF) is a well-established
tool for audio analysis. However, it is not well suited
for learning on weakly labeled data, i.e. data where the exact
timestamp of the sound of interest is not known. In this paper
we propose a novel extension to NMF that allows it to extract
meaningful representations from weakly labeled audio data.
Recently, a constraint on the activation matrix was proposed
to adapt NMF to learning from weak labels. To further improve the
method, we propose adding an orthogonality regularizer on the
dictionary to the cost function of NMF. In that way we obtain
appropriate dictionaries for the sounds of interest and background
sounds from weakly labeled data. We demonstrate
that the proposed Orthogonality-Regularized Masked NMF
(ORM-NMF) can be used for Audio Event Detection of rare
events and evaluate the method on the development data from
Task 2 of the DCASE 2017 Challenge.
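To make the idea concrete, here is a rough NumPy sketch of NMF with a binary mask on the activations and an orthogonality penalty on the dictionary. It is an illustration under stated assumptions rather than the formulation from the paper: the Euclidean cost, the penalty (the sum of inner products between distinct atoms, weighted by `lam`), the multiplicative updates and the name `masked_orth_nmf` are all choices made for this example. The mask is assumed to have one row per atom and one column per frame, with zeros wherever the weak label says an atom's sound cannot be present.

```python
import numpy as np


def masked_orth_nmf(V, mask, lam=0.1, n_iter=200, eps=1e-9, seed=0):
    """Approximate V (F x T) by W @ H with masked H and near-orthogonal atoms.

    Assumed objective (illustrative only):
        0.5 * ||V - W H||_F^2 + (lam / 2) * sum_{i != j} w_i^T w_j
    subject to W, H >= 0 and H == 0 wherever mask == 0.
    """
    rng = np.random.default_rng(seed)
    n_freq = V.shape[0]
    n_atoms = mask.shape[0]
    W = rng.random((n_freq, n_atoms)) + eps
    H = (rng.random(mask.shape) + eps) * mask   # masked entries start at zero
    off_diag = np.ones((n_atoms, n_atoms)) - np.eye(n_atoms)
    for _ in range(n_iter):
        # Multiplicative updates keep zeros at zero, so the mask is preserved;
        # it is reapplied anyway to make the constraint explicit.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        H *= mask
        # The gradient of the penalty, lam * W @ off_diag, is non-negative,
        # so it is added to the denominator of the dictionary update.
        W *= (V @ H.T) / (W @ (H @ H.T) + lam * (W @ off_diag) + eps)
    return W, H
```

In this sketch the mask forces atoms reserved for the sound of interest to stay silent in recordings labeled as background only, while the penalty discourages event and background atoms from overlapping.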