My research project
Audio-Visual Multi-Speaker Tracking
Automatic tracking of multiple speakers has many applications, e.g. in security surveillance, human-machine interaction, and robotics. Different sensors (such as microphones and cameras) have been used jointly to track multiple speakers in cluttered environments with several moving speakers and background noise. A number of challenges remain, however. For example, how can the unknown and time-varying number of speakers be estimated? How should the uncertainties associated with the audio-visual measurements, such as false detections, mis-detections, noise, and clutter, be handled? The aim of my project is to develop novel ideas to address these challenges by building on a recent baseline developed in the Centre for Vision, Speech and Signal Processing at the University of Surrey, namely the particle flow probability hypothesis density (PHD) filtering algorithms, which fuse the audio-visual measurements and estimate the time-varying number of targets in the presence of measurement uncertainties.
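To give a flavour of the PHD filtering idea mentioned above, the sketch below shows a minimal bootstrap particle PHD filter on a 1-D toy problem. This is not the particle flow variant developed at Surrey, and all model choices here (constant-position dynamics, Gaussian measurement likelihood, uniform clutter, the surveillance region, and every constant) are illustrative assumptions, not details from the project. The key property illustrated is that the total particle weight approximates the expected number of targets, which is how a PHD filter estimates a time-varying target count.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model parameters (assumed for illustration only).
P_S, P_D = 0.95, 0.9          # survival and detection probabilities
SIGMA_Q, SIGMA_R = 0.5, 0.3   # process / measurement noise std. dev.
CLUTTER = 0.1                 # clutter intensity kappa(z), uniform over the region
N_BIRTH, BIRTH_MASS = 100, 0.2  # birth particles per step and their total weight
N_PARTICLES = 500             # particle budget after resampling

def phd_step(particles, weights, measurements):
    """One predict/update/resample cycle of a bootstrap particle PHD filter."""
    # Predict: surviving particles diffuse; their weights scale by P_S.
    particles = particles + SIGMA_Q * rng.standard_normal(particles.shape)
    weights = P_S * weights
    # Birth: spawn new particles uniformly over the region [-10, 10].
    particles = np.concatenate([particles, rng.uniform(-10, 10, N_BIRTH)])
    weights = np.concatenate([weights, np.full(N_BIRTH, BIRTH_MASS / N_BIRTH)])
    # Update: missed-detection term plus one corrective term per measurement.
    new_w = (1.0 - P_D) * weights
    for z in measurements:
        g = np.exp(-0.5 * ((z - particles) / SIGMA_R) ** 2) \
            / (SIGMA_R * np.sqrt(2.0 * np.pi))          # Gaussian likelihood g(z|x)
        new_w += P_D * g * weights / (CLUTTER + np.sum(P_D * g * weights))
    # Resample to a fixed budget while preserving the total mass,
    # since sum(weights) approximates the expected number of targets.
    mass = new_w.sum()
    idx = rng.choice(len(particles), size=N_PARTICLES, p=new_w / mass)
    return particles[idx], np.full(N_PARTICLES, mass / N_PARTICLES)

# Usage: two stationary "speakers" at -3 and 4, always detected (simplified).
particles = rng.uniform(-10, 10, N_PARTICLES)
weights = np.full(N_PARTICLES, 0.1 / N_PARTICLES)
for _ in range(30):
    z = np.array([-3.0, 4.0]) + SIGMA_R * rng.standard_normal(2)
    particles, weights = phd_step(particles, weights, z)
print(f"estimated number of targets ~ {weights.sum():.2f}")
```

After a few steps the particle mass concentrates around the two targets and the total weight settles near two, illustrating how the filter infers the number of targets without an explicit data-association step.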