We address the problem of sparse signal reconstruction from a few noisy samples. Recently, a Covariance-Assisted Matching Pursuit (CAMP) algorithm has been proposed, improving the sparse coefficient update step of the classic Orthogonal Matching Pursuit (OMP) algorithm. CAMP allows the a priori mean and covariance of the non-zero coefficients to be taken into account in the coefficient update step. In this paper, we analyze CAMP, which leads to a new interpretation of the update step as a maximum a posteriori (MAP) estimation of the non-zero coefficients at each iteration. We then propose to leverage this idea by computing a MAP estimate for the sparse reconstruction problem in a greedy, OMP-like fashion. Our approach allows the statistical dependencies between sparse coefficients to be modelled, while retaining the practicality of OMP. Experiments show improved performance when reconstructing a signal from a few noisy samples.
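The MAP interpretation described above can be made concrete with a minimal sketch: with a Gaussian prior N(mu, C) on the active coefficients and noise variance sigma2, the MAP update on a candidate support has a closed form, which can replace the least-squares step inside a greedy OMP-style loop. Function names and the exact interface are ours, not the paper's.

```python
import numpy as np

def map_coefficient_update(A_S, y, mu_S, C_S, sigma2):
    """MAP estimate of the coefficients on a support S, assuming a
    Gaussian prior N(mu_S, C_S) and noise variance sigma2.
    Illustrative sketch of a CAMP-style update, not the authors' code."""
    Cinv = np.linalg.inv(C_S)
    H = A_S.T @ A_S / sigma2 + Cinv          # posterior precision
    b = A_S.T @ y / sigma2 + Cinv @ mu_S     # combines data and prior
    return np.linalg.solve(H, b)

def omp_map(A, y, K, mu, C, sigma2):
    """Greedy OMP-like loop using the MAP update above."""
    residual, support = y.copy(), []
    x_S = np.zeros(0)
    for _ in range(K):
        # select the atom most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        x_S = map_coefficient_update(A[:, support], y,
                                     mu[support],
                                     C[np.ix_(support, support)],
                                     sigma2)
        residual = y - A[:, support] @ x_S
    return support, x_S
```

With a very weak prior (large C), the update reduces to the ordinary least-squares step of OMP; a tighter prior pulls the coefficients toward mu.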
Musical noise is a recurrent issue in spectral techniques for denoising or blind source separation. Due to localised estimation errors, isolated peaks may appear in the processed spectrograms, resulting in annoying tonal sounds after synthesis, known as "musical noise". In this paper, we propose a method to assess the amount of musical noise in an audio signal by characterising the impact of these artificial isolated peaks on the processed sound. It turns out that, because of the constraints between STFT coefficients, the isolated peaks appear as time-frequency "spots" in the spectrogram of the processed audio signal. The quantification of these "spots", achieved by adapting a method for the localisation of significant STFT regions, allows for an evaluation of the amount of musical noise. We believe that this will pave the way to an objective measure and a better understanding of this phenomenon.
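As a crude illustration of the idea of counting isolated time-frequency "spots", the sketch below flags spectrogram bins that dominate all eight of their neighbours by a large ratio. The neighbourhood rule and the threshold are our own simplifications, not the paper's detection method.

```python
import numpy as np

def count_isolated_spots(S, ratio=10.0):
    """Count magnitude-spectrogram bins that exceed every one of their
    8 neighbours by a factor `ratio` -- a crude proxy for musical-noise
    'spots'. The ratio threshold is a hypothetical choice."""
    F, T = S.shape
    count = 0
    for f in range(1, F - 1):
        for t in range(1, T - 1):
            neigh = S[f - 1:f + 2, t - 1:t + 2].copy()
            neigh[1, 1] = 0.0                  # exclude the bin itself
            if S[f, t] > ratio * neigh.max():  # isolated dominant peak
                count += 1
    return count
```

A spectrogram processed by aggressive spectral subtraction would typically score higher on such a count than the clean original.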
We address the problem of decomposing several consecutive sparse signals, such as audio time frames or image patches. A typical approach is to process each signal sequentially and independently, with an arbitrary sparsity level fixed for each signal. Here, we propose to process several frames simultaneously, allowing for more flexible sparsity patterns to be considered. We propose a multivariate sparse coding approach, where sparsity is enforced on average across several frames, and a Multivariate Iterative Hard Thresholding algorithm to solve this problem. The usefulness of the proposed approach is demonstrated on audio coding and denoising tasks. Experiments show that it leads to better results when the signal contains both transients and tonal components.
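The "sparsity on average" idea can be sketched as an iterative hard thresholding loop where, instead of keeping the K largest coefficients per frame, all T frames share a single budget of K·T non-zeros. This is a minimal illustrative variant; parameter names and the stopping rule are ours, not the paper's.

```python
import numpy as np

def multivariate_iht(Y, D, avg_k, n_iter=100, step=None):
    """IHT sketch with sparsity enforced on average across frames:
    the T columns of Y share a global budget of avg_k * T non-zeros,
    so transient-heavy frames may use more atoms than tonal ones."""
    n_atoms, T = D.shape[1], Y.shape[1]
    X = np.zeros((n_atoms, T))
    if step is None:
        step = 1.0 / np.linalg.norm(D, 2) ** 2   # step for stability
    budget = int(avg_k * T)
    for _ in range(n_iter):
        X = X + step * D.T @ (Y - D @ X)         # gradient step
        flat = np.abs(X).ravel()
        if budget < flat.size:
            thr = np.partition(flat, -budget)[-budget]
            X[np.abs(X) < thr] = 0.0             # keep budget largest entries
    return X
```

With a per-frame threshold instead of the global one, this would reduce to running ordinary IHT independently on each frame.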
An OMP-like Covariance-Assisted Matching Pursuit (CAMP) method has recently been proposed. Given prior knowledge of the covariance and mean of the sparse coefficients, CAMP balances the least squares estimator and the prior knowledge by leveraging the Gauss-Markov theorem. In this letter, we study the performance of CAMP in the framework of the restricted isometry property (RIP). It is shown that, under some conditions on the RIP and on the minimum magnitude of the nonzero elements of the sparse signal, CAMP with sparsity level K can recover the exact support of the sparse signal from noisy measurements. Both l2-bounded noise and Gaussian noise are considered in our analysis. We also discuss extreme noise conditions (e.g. when the noise power is infinite) to illustrate the stability of CAMP.
Clipping, or saturation, is a common nonlinear distortion in signal processing. Recently, declipping techniques have been proposed based on sparse decomposition of the clipped signals on a fixed dictionary, with additional constraints on the amplitude of the clipped samples. Here we propose a dictionary learning approach, where the dictionary is learned directly from the clipped measurements. We propose a soft-consistency metric that measures the distance to a convex feasibility set and takes into account our knowledge of the clipping process. We then propose a gradient descent-based dictionary learning algorithm that minimizes the proposed metric and is thus consistent with the clipping measurements. Experiments show that the proposed algorithm outperforms other dictionary learning algorithms applied to clipped signals. We also show that learning the dictionary directly from the clipped signals outperforms consistent sparse coding with a fixed dictionary.
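The soft-consistency metric rests on a standard construction: an estimate is consistent with clipping at ±tau if it matches the observation on unclipped samples, is at least tau where the observation saturated high, and at most −tau where it saturated low. Projecting onto this convex set gives the distance used as the cost; the sketch below illustrates the construction with our own variable names.

```python
import numpy as np

def project_clip_consistent(z, y, tau):
    """Project an estimate z onto the clipping-consistency set for an
    observation y clipped at +/- tau. A standard construction; this is
    an illustrative sketch, not the authors' code."""
    p = z.copy()
    reliable = np.abs(y) < tau
    p[reliable] = y[reliable]            # unclipped samples must match y
    hi = y >= tau
    p[hi] = np.maximum(z[hi], tau)       # clipped-high: at least tau
    lo = y <= -tau
    p[lo] = np.minimum(z[lo], -tau)      # clipped-low: at most -tau
    return p

def soft_consistency(z, y, tau):
    """Squared distance from z to the feasibility set: zero iff z is
    fully consistent with the clipped observation."""
    return float(np.sum((z - project_clip_consistent(z, y, tau)) ** 2))
```

Because the set is convex, this squared distance is convex and differentiable in z, which is what makes it usable inside a gradient-descent dictionary learning loop.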
Non-negative Matrix Factorization (NMF) is a well-established tool for audio analysis. However, it is not well suited for learning on weakly labeled data, i.e. data where the exact timestamp of the sound of interest is not known. In this paper we propose a novel extension to NMF that allows it to extract meaningful representations from weakly labeled audio data. Recently, a constraint on the activation matrix was proposed to adapt NMF to learning from weak labels. To further improve the method, we propose to add an orthogonality regularizer on the dictionary to the cost function of NMF. In that way we obtain appropriate dictionaries for the sounds of interest and background sounds from weakly labeled data. We demonstrate that the proposed Orthogonality-Regularized Masked NMF (ORM-NMF) can be used for Audio Event Detection of rare events, and evaluate the method on the development data from Task 2 of the DCASE2017 Challenge.
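To illustrate what an orthogonality regularizer on the dictionary looks like in practice, the sketch below adds a penalty on the off-diagonal entries of WᵀW to standard multiplicative NMF updates. This is our own minimal variant for illustration; it is not the exact ORM-NMF update rules, and it omits the masking component entirely.

```python
import numpy as np

def orm_nmf_sketch(V, n_components, lam=0.1, n_iter=200, seed=0):
    """NMF with an orthogonality penalty on the dictionary W:
    cost = ||V - W H||_F^2 + lam * sum_{i != j} (W^T W)_{ij}.
    The penalty discourages overlapping dictionary atoms, pushing
    event and background atoms apart. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, n_components)) + 1e-3
    H = rng.random((n_components, T)) + 1e-3
    off = 1.0 - np.eye(n_components)   # penalize cross-products only
    eps = 1e-9
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # the extra lam * W @ off term in the denominator comes from
        # the gradient of the orthogonality penalty
        W *= (V @ H.T) / (W @ (H @ H.T) + lam * (W @ off) + eps)
    return W, H
```

With lam = 0 this reduces to the classical multiplicative-update NMF; increasing lam trades reconstruction accuracy for less correlated atoms.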
We address the problem of recovering a sparse signal from clipped or quantized measurements. We show how these two problems can be formulated as minimizing the distance to a convex feasibility set, which provides a convex and differentiable cost function. We then propose a fast iterative shrinkage/thresholding algorithm that minimizes the proposed cost, providing a fast and efficient way to recover sparse signals from clipped and quantized measurements.
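The quantized case of the formulation above can be sketched as follows: a measurement quantized to y with bin width delta constrains the true value to [y − delta/2, y + delta/2], the distance to that box is differentiable, and an ISTA-style loop alternates a gradient step on this distance with soft thresholding for sparsity. This is a minimal illustrative sketch (plain ISTA rather than the accelerated variant); names and the bin convention are our assumptions.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def project_quantized(z, y, delta):
    """Project z onto the set consistent with quantized observations y
    (mid-rise bins of width delta centred on y). Names are ours."""
    return np.clip(z, y - delta / 2, y + delta / 2)

def ista_quantized(A, y, delta, lam=0.1, n_iter=300):
    """ISTA sketch minimizing
        0.5 * || A x - P_C(A x) ||^2 + lam * ||x||_1,
    where P_C projects onto the quantization-consistency set, so the
    data term is zero whenever A x falls inside the observed bins."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        r = A @ x
        grad = A.T @ (r - project_quantized(r, y, delta))
        x = soft_threshold(x - step * grad, step * lam)
    return x
```

Swapping `project_quantized` for a clipping-consistency projection handles the declipping case with the same loop, which is what makes the unified feasibility-set formulation convenient.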