(SMC-PHD) filter has been shown to be promising for
audio-visual multi-speaker tracking. Recently, the zero diffusion
particle flow (ZPF) has been used to mitigate the weight
degeneracy problem in the SMC-PHD filter. However, this
leads to a substantial increase in the computational cost due to
the migration of particles from prior to posterior distribution
with a partial differential equation. This paper proposes an alternative
method based on the non-zero diffusion particle flow
(NPF) to adjust the particle states by fitting the particle distribution
with the posterior probability density using the non-zero
diffusion. This property allows efficient computation of
the migration of particles. Results from the AV16.3 dataset
demonstrate that we can significantly mitigate the weight degeneracy
problem with a smaller computational cost as compared
with the ZPF based SMC-PHD filter.
multi-speaker tracking. Proc. 13th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA 2017), Grenoble, France, February 21-23, 2017. In: Tichavský P, Babaie-Zadeh M, Michel O, Thirion-Moreau N (eds.), Latent Variable Analysis and Signal Separation, vol. 10169, pp. 344-353. Springer
PHD) filtering has been recently exploited for audio-visual (AV) based
tracking of multiple speakers, where audio data are used to inform the
particle distribution and propagation in the visual SMC-PHD filter. However, the performance of the AV-SMC-PHD filter can be affected by the
mismatch between the proposal and the posterior distribution. In this paper, we present a new method to improve the particle distribution where
audio information (i.e. DOA angles derived from microphone array measurements) is used to detect new born particles and visual information
(i.e. histograms) is used to modify the particles with particle
flow, which has the benefit of migrating particles smoothly from
the prior to the posterior distribution. We compare the proposed algorithm with the baseline AV-SMC-PHD algorithm using experiments on
the AV16.3 dataset with multi-speaker sequences.
(SMC-PHD) filter assisted by particle flows (PF) has been
shown to be promising for audio-visual multi-speaker tracking. A clustering step is often employed for calculating the particle flow, which leads to a substantial increase in the computational cost. To address this issue, we propose an alternative method based on the labelled non-zero particle flow (LNPF) to adjust the particle states. Results obtained from
the AV16.3 dataset show improved performance by the proposed method in terms of computational efficiency and tracking accuracy as compared with baseline AV-NPF-SMC-PHD methods.
both zero and non-zero diffusion particle flows (ZPF/NPF), and developed two new algorithms, AV-ZPF-SMC-PHD and AV-NPF-SMC-PHD,
where the speaker states from the previous frames are also considered for particle relocation. The proposed algorithms are compared systematically with several baseline tracking methods using the AV16.3, AVDIAR and CLEAR datasets, and are shown to offer improved tracking accuracy and average effective sample size (ESS).
The audio-visual sequential Monte Carlo probability hypothesis density (AV-SMC-PHD) filter is a popular baseline for multi-target tracking, offering an elegant framework for fusing audio-visual information and dealing with a varying number of speakers. However, the performance of this filter can be adversely affected by the weight degeneracy problem, where the weights of most of the particles become very small while only a few remain significant as the algorithm iterates.
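Weight degeneracy is commonly quantified by the effective sample size (ESS). The following is a minimal sketch of the standard estimate, one over the sum of squared normalised weights; the function name `effective_sample_size` and the example weights are illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def effective_sample_size(weights):
    """Standard ESS estimate: 1 / sum of squared normalised weights."""
    w = np.asarray(weights, dtype=float)
    total = w.sum()
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    # In a PHD filter the weights sum to the expected number of targets,
    # not to one, so normalise before applying the usual formula.
    w = w / total
    return 1.0 / np.sum(w ** 2)


# Illustrative weights (assumed values): an even spread versus a set
# in which a single particle carries almost all of the mass.
uniform = np.full(100, 0.01)
degenerate = np.array([0.97] + [0.03 / 99] * 99)

print("ESS, uniform weights:   ", effective_sample_size(uniform))
print("ESS, degenerate weights:", effective_sample_size(degenerate))
```

With N particles the estimate equals N when all weights are equal and approaches 1 when a single particle dominates, which is the situation described above.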
To address this issue, this thesis improves the AV-SMC-PHD filter by incorporating particle flows defined in terms of an ordinary differential equation and the Fokker-Planck equation. It considers both zero and non-zero diffusion particle flows (ZPF/NPF) and develops two new algorithms, AV-ZPF-SMC-PHD and AV-NPF-SMC-PHD, where the speaker states from the previous frames are also considered for particle relocation. The particle flow migrates particles from the prior distribution to the posterior distribution, using a homotopy function which defines the flow in synthetic time. The proposed methods can mitigate the particle degeneracy of the AV-SMC-PHD filter and improve tracking accuracy.
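To illustrate how a zero-diffusion flow moves particles in synthetic time, the sketch below integrates the textbook exact (Daum-Huang) flow for a scalar linear-Gaussian model, where the measurement is assumed to be z = x + noise. It is not the AV-ZPF-SMC-PHD algorithm of the thesis; the function name `exact_flow_1d`, the Euler integration and all numerical values are assumptions chosen for illustration.

```python
import numpy as np


def exact_flow_1d(particles, prior_mean, prior_var, z, meas_var, n_steps=200):
    """Move scalar particles from prior to posterior with a zero-diffusion flow.

    Assumes a Gaussian prior N(prior_mean, prior_var) and a measurement
    z = x + noise with Gaussian noise of variance meas_var.
    """
    x = np.array(particles, dtype=float)
    d_lam = 1.0 / n_steps
    for k in range(n_steps):
        # Synthetic time lam runs from 0 (prior) to 1 (posterior);
        # the flow coefficients are evaluated at the middle of each step.
        lam = (k + 0.5) * d_lam
        a = -0.5 * prior_var / (lam * prior_var + meas_var)
        b = (1.0 + 2.0 * lam * a) * (
            (1.0 + lam * a) * prior_var * z / meas_var + a * prior_mean
        )
        # Euler step of the ordinary differential equation dx/dlam = a*x + b.
        x = x + d_lam * (a * x + b)
    return x


# Hypothetical example: prior N(0, 1), measurement z = 2 with noise variance 1.
rng = np.random.default_rng(0)
prior_particles = rng.normal(0.0, 1.0, size=5000)
moved = exact_flow_1d(prior_particles, 0.0, 1.0, z=2.0, meas_var=1.0)

# Kalman update for the same model, shown for comparison.
gain = 1.0 / (1.0 + 1.0)
print("flow mean and variance:  ", moved.mean(), moved.var())
print("Kalman mean and variance:", 0.0 + gain * 2.0, (1.0 - gain) * 1.0)
```

Because every particle is moved deterministically and keeps its weight, no resampling step is needed in this sketch, which is the sense in which such a flow avoids weight degeneracy.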
Another issue is that the performance of the multi-speaker tracking algorithms is often degraded by mis-detection and clutter in the measurements. To address this issue, this thesis proposes an intensity particle flow (IPF) SMC-PHD filter based on the intensity function derived from the measurements, informed by the clutter density and the detection probability. The IPF-SMC-PHD filter improves tracking accuracy, but induces a high computational overhead, due to the requirement for computing the sum of the likelihood intensity functions and the third-order differentiation of the likelihood density. As a result, the computational complexity of IPF is proportional to the cube of the number of measurements.
To address this problem, this thesis proposes a labelled particle flow (LPF) algorithm where particle labels are estimated from the measurements from multiple sensors and then used to update particles and estimate speaker states. Since the LPF only uses the first-order differentiation of the likelihood density and replaces the clustering step by the sum of particle states, LPF offers higher computational efficiency than other particle flow methods, where a clustering method is often used to estimate the target states. All the proposed methods are extensively evaluated using different datasets, such as AV16.3, AVDIAR and CLEAR. The results show that the proposed methods mitigate the weight degeneracy problem and offer higher tracking accuracy than the baseline methods in a variety of scenarios such as occlusion and rapid movements of the speakers.