Dr Wenwu Wang

Senior Lecturer

Qualifications: BSc MEng PhD SrMIEEE

Email:
Phone: Work: 01483 68 6039
Room no: 22 AB 05

Further information

Biography

Wenwu Wang is a Senior Lecturer at the Centre for Vision, Speech and Signal Processing, University of Surrey, which he joined in May 2007. Prior to this, he was a Postdoctoral Research Associate at King's College London (from May 2002 to December 2003) and Cardiff University (from January 2004 to April 2005).

He has also worked in UK industry, first as a DSP Engineer at Tao Group Ltd (now Antix Labs Ltd) (from May 2005 to August 2006), then as an R&D Engineer at Creative Labs (from September 2006 to April 2007). In spring 2008, he was a visiting scholar at the Perception and Neurodynamics Lab and the Center for Cognitive Science, The Ohio State University. He is currently a member of the MOD University Defence Research Centre in Signal Processing (since 2009) and the BBC Audio Research Partnership (since 2011). He obtained his PhD degree in April 2002 from Harbin Engineering University, China.

His current research interests include blind signal processing, sparse signal processing, audio-visual signal processing, machine learning and perception, and machine audition (listening). He is a Senior Member of the IEEE, and belongs to the IEEE Signal Processing, Computational Intelligence, and Circuits and Systems Societies.

He was an Area Chair of the 2012 European Signal Processing Conference, a Track Chair and Publicity Co-Chair of the 2009 IEEE Statistical Signal Processing Workshop, and Program Co-Chair of the 2009 IEEE Global Congress on Intelligent Systems. He has been a Session Chair for numerous conferences including ICASSP 2012 and EUSIPCO 2012.

He won the DSTL Best Solution Award (with Qingju Liu) at the DSTL Challenge Workshop for the signal processing challenge “under-sampled signal recognition” in 2012, received a Best Student Paper Award nomination (with Qingju Liu) at the 9th International Conference on Latent Variable Analysis and Signal Separation in 2010, and had a Hot Paper (feature article) in the Wiley/IEEE worldwide advert for publications in signal and image processing in 2008.

Please visit my personal page for more information, including downloadable publications.

Research Interests

Research Grants

I appreciate the financial support for my research from the following bodies (since 2008): Engineering and Physical Sciences Research Council (EPSRC), Ministry of Defence (MOD), Defence Science and Technology Laboratory (DSTL), Home Office (HO), Royal Academy of Engineering (RAENG), National Natural Science Foundation of China (NSFC), the University Research Support Fund (URSF), and the Ohio State University (OSU). [Total awards to Surrey as PI/CI approximately £1.5M (as PI: £1.1M, as CI: £400K). As PI/CI on a total grant portfolio: approximately £4.5M]

  • 04/2013-04/2018, "Signal processing solutions to a networked battlespace", EPSRC and DSTL (signal processing call). [jointly with Loughborough University, University of Strathclyde, and Cardiff University.]
  • 12/2012-09/2013, "Audio-visual cues based attention switching for machine listening", MILES and EPSRC (feasibility study). [jointly with School of Psychology and Department of Computing.]
  • 11/2012-07/2013, "Audio-visual blind source separation", NSFC (international collaboration scheme). [jointly with Nanchang University, China.]
  • 12/2011-03/2012, "Enhancement of audio using video", HO (pathway to impact). [jointly with University of East Anglia.]
  • 10/2010-10/2013, "Audio and video based speech separation for multiple moving sources within a room environment", EPSRC (responsive mode). [jointly with Loughborough University.]
  • 10/2009-10/2012, "Multimodal blind source separation for robot audition", EPSRC and DSTL (signal processing call).
  • 05/2008-06/2008, "Convolutive non-negative sparse coding", RAENG (international travel grant).
  • 02/2008-06/2008, "Convolutive non-negative matrix factorization", URSF (small grant).
  • 02/2008-03/2008, "Computational audition", OSU (visiting scholarship).

Research Personnel

   Current Postdoc Fellow

  • Dr Mark Barnard (10/2010 - ): Audio-Visual Speech Separation of Multiple Moving Sources (Principal Supervisor: Dr Wenwu Wang; Co-supervisor: Prof Josef Kittler. External Collaborators: Prof Jonathon Chambers, Loughborough University; Dr Sangarapillai Lambotharan, Loughborough University; Prof Christian Jutten, Grenoble, France, and Dr Rivet Bertrand, Grenoble, France)

   Current PhD Students

  • Jing Dong: Sparse representations for audio-visual signal processing (Principal Supervisor: Dr Wenwu Wang; Co-Supervisor: Dr Philip Jackson; External Collaborator: Dr Wei Dai, Imperial College London)
  • Volkan Kilic: Robust audio visual tracking of multiple moving sources for robot audition (Principal Supervisor: Dr Wenwu Wang; Co-Supervisors: Prof Josef Kittler and Dr Mark Barnard)
  • Shahrzad Shapoori: Tensor factorization in EEG signal processing (Principal Supervisor: Dr Saeid Sanei, Department of Computing; Co-Supervisor: Dr Wenwu Wang)
  • Amran Abdul Hadi: Audio-visual fusion for convolutive source separation (Principal Supervisor: Dr Saeid Sanei, Department of Computing; Co-Supervisor: Dr Wenwu Wang)
  • Philip Coleman: Cross-talk cancellation in spatial audio (Principal supervisor: Dr Philip Jackson; Co-Supervisor: Dr Wenwu Wang)
  • Marek Olik: Sound zone creation for spatial audio (Principal Supervisor: Dr Philip Jackson; Co-Supervisor: Dr Wenwu Wang)
  • Syed Zubair: Dictionary learning and sparse representation for signal classification (Principal Supervisor: Dr Wenwu Wang; Co-Supervisor: Dr Philip Jackson; Internal collaborator: Dr Fei Yan; External collaborator: Dr Wei Dai, Imperial College London)
  • Atiyeh Alinaghi: Joint sound source localisation and separation (Principal Supervisor: Dr Philip Jackson; Co-Supervisor: Dr Wenwu Wang)
  • Qingju Liu: Multimodal blind source separation for robot audition (Principal Supervisor: Dr Wenwu Wang; Co-Supervisors: Dr Philip Jackson, Prof Josef Kittler; External collaborator: Prof Jonathon Chambers, Loughborough University) [Qingju Liu was the winner of the Best Solution Award at the DSTL Challenge Workshop for the signal processing challenge "Undersampled Signal Recognition", announced at the SSPD 2012 conference, London, September 25-27, 2012.]
  • Tao Xu: Compressed sensing based signal recovery and its application in auditory scene analysis (Principal Supervisor: Dr Wenwu Wang; Co-supervisor: Dr Philip Jackson; External collaborator: Dr Wei Dai, Imperial College London)

   Current Academic Visitors

  • Dr Ye Zhang: Analysis dictionary learning and source separation (Associate Professor, Nanchang University, Nanchang, China.)
  • Xiaoyi Chen: Convolutive blind source separation of underwater acoustic mixtures (PhD student, Northwestern Polytechnical University, Xi'an, China.)

   Former PhD Graduates

  • Rakkrit Duangsoithong (PhD, Sept 2012): Feature selection and causal discovery for ensemble classifiers (Principal Supervisor: Dr Terry Windeatt; Co-Supervisor: Dr Wenwu Wang)
  • Tariqullah Jan (PhD, Feb 2012): Blind convolutive speech separation and dereverberation (Principal Supervisor: Dr Wenwu Wang; Co-supervisor: Prof Josef Kittler; External collaborator: Prof DeLiang Wang, The Ohio State University)

   Former MSc Graduates

  • Xiao Han (MSc, 2012, awarded Distinction); Project: Underdetermined reverberant speech separation
  • Jian Han (MSc, 2012, awarded Distinction); Project: Microphone array based acoustic tracking of multiple moving speakers (co-supervised with Dr Mark Barnard)
  • Tianyu Feng (MSc, 2012); Project: Multi-pitch estimation and tracking
  • Yuli Ling (MSc, 2012); Project: Audio event detection from sound mixtures
  • Danyang Shen (MSc, 2012); Project: Audio-visual tracking of multiple moving speakers (co-supervised with Dr Mark Barnard)
  • Kai Song (MSc, 2012); Project: Environment recognition from sound scenes (co-supervised with Dr Fei Yan)
  • Xinpu Han (MSc, 2012); Project: Compressed sensing for natural image coding
  • Steven Grima (MSc, 2011, awarded Distinction); Project: Multimodal tracking of multiple moving sources (co-supervised with Dr Mark Barnard)
  • Anil Lal (MSc, 2011, awarded Distinction); Project: Monaural music sound separation using spectral envelope template and isolated note information
  • Xi Luo (MSc, 2011, awarded Distinction); Project: Reverberant speech enhancement
  • Yunyi Wang (MSc, 2011); Project: Compressed sensing for image coding
  • Ritesh Agarwal (MSc, 2011); Project: Multiple pitch tracking
  • Yichen Li (MSc, 2011); Project: Environmental sound recognition (co-supervised with Dr Fei Yan)
  • Tengxu Yang (MSc, 2011); Project: Ideal binary mask estimation in computational auditory scene analysis
  • Jin Ke (MSc, 2011); Project: Audio-visual tracking and localisation of moving speakers (co-supervised with Dr Mark Barnard)
  • Zijian Zhang (MSc, 2011); Project: Convolutive blind source separation of speech mixtures
  • Hafiz Mustafa (MSc, 2010); Project: Single channel music sound separation

   Former Visitors

  • Stefan Soltuz (PhD, 10/2008 - 07/2009): Non-negative matrix factorization for music audio separation (Co-supervisor: Dr Philip Jackson)
  • Yanfeng Liang (MSc, 05/2009): Adaptive signal processing for clutter removal in radar images (Main supervisor: Prof Jonathon Chambers, Loughborough University)

Research Collaborations

  Academic:

  • Loughborough University (UK)
  • RIKEN (Japan)
  • Ohio State University (USA)
  • Imperial College London (UK)
  • Cardiff University (UK)
  • Strathclyde University (UK)
  • University of East Anglia (UK)
  • Nanchang University (China)
  • Northwestern Polytechnical University (China)
  • RMIT University (Australia)
  • Gipsa-lab (France)
  • Nanyang Technological University (Singapore)

   Industrial:

  • Dstl
  • BBC
  • Thales
  • Qinetiq
  • Texas Instruments
  • Stellar
  • Digital Barriers
  • Selex Galileo
  • PrismTech
  • Steepest Ascent

Research Publications

  • Check the full publication list here

Publications

Highlights

  • Wang W. (2011) Preface of Machine Audition: Principles, Algorithms and Systems. in Wang W (ed.) Machine Audition: Principles, Algorithms and Systems Information Science Reference , pp. xv-xxi.

    Abstract

    "This book covers advances in algorithmic developments, theoretical frameworks, and experimental research findings to assist professionals who want an improved ...

  • Jan T, Wang W, Wang D. (2011) 'A Multistage Approach to Blind Separation of Convolutive Speech Mixtures'. Speech Communication, 53 (4), pp. 524-539.
  • Wang W. (2010) Instantaneous versus Convolutive Non-negative Matrix Factorization: Models, Algorithms and Applications to Audio Pattern Separation. in Wang W (ed.) Machine Audition: Principles, Algorithms and Systems Information Science Reference Article number 15 , pp. 353-370.
  • Wang W. (2010) Machine Audition: Principles, Algorithms and Systems. New York, USA : Information Science Reference
  • Jan T, Wang W. (2010) Cocktail Party Problem: Source Separation Issues and Computational Methods. in Wang W (ed.) Machine Audition: Principles, Algorithms and Systems New York, USA : Information Science Reference Article number 3 , pp. 61-79.
  • Zhou S, Wang W. (2009) IEEE/WRI Global Congress on Intelligent Systems Proceedings. USA : IEEE Computer Society
  • Wang W, Cichocki A, Chambers JA. (2009) 'A multiplicative algorithm for convolutive non-negative matrix factorization based on squared euclidean distance'. IEEE Transactions on Signal Processing, 57 (7), pp. 2858-2864.
  • Wang W, Cichocki A, Mørup M, Smaragdis P, Zdunek R. (2008) 'Advances in nonnegative matrix and tensor factorization'. Computational Intelligence and Neuroscience, 2008, Article number 852187
  • Wang W, Luo Y, Chambers JA, Sanei S. (2008) 'Note onset detection via nonnegative factorization of magnitude spectrum'. EURASIP Journal on Advances in Signal Processing, Article number 231367

Journal articles

  • Naik GR, Wang W. (2012) 'Audio analysis of statistically instantaneous signals with mixed Gaussian probability distributions'. International Journal of Electronics, 99 (10), pp. 1333-1350.
  • Liu Q, Wang W, Jackson P. (2012) 'Use of bimodal coherence to resolve the permutation problem in convolutive BSS'. Elsevier Signal Processing, 92 (8), pp. 1916-1927.
  • Dai W, Xu T, Wang W. (2012) 'Simultaneous codeword optimization (SimCO) for dictionary update and learning'. IEEE Transactions on Signal Processing, 60 (12), pp. 6340-6353.
  • Mohsen Naqvi S, Khan MS, Chambers JA, Wang W, Barnard M. (2012) 'Multimodal (audio-visual) source separation exploiting multi-speaker tracking, robust beamforming and time-frequency masking'. IET Signal Processing, 6 (5), pp. 466-477.
  • Jan T, Wang W, Wang D. (2011) 'A Multistage Approach to Blind Separation of Convolutive Speech Mixtures'. Speech Communication, 53 (4), pp. 524-539.
  • Wang W, Cichocki A, Chambers JA. (2009) 'A multiplicative algorithm for convolutive non-negative matrix factorization based on squared euclidean distance'. IEEE Transactions on Signal Processing, 57 (7), pp. 2858-2864.
  • Wang W, Luo Y, Chambers JA, Sanei S. (2008) 'Note onset detection via nonnegative factorization of magnitude spectrum'. EURASIP Journal on Advances in Signal Processing, Article number 231367
  • Wang W, Cichocki A, Mørup M, Smaragdis P, Zdunek R. (2008) 'Advances in nonnegative matrix and tensor factorization'. Computational Intelligence and Neuroscience, 2008, Article number 852187
  • Luo Y, Wang W, Chambers JA, Lambotharan S, Proudler I. (2006) 'Exploitation of source nonstationarity in underdetermined blind source separation with advanced clustering techniques'. IEEE Transactions on Signal Processing, 54 (6 I), pp. 2198-2212.
  • Jafari MG, Wang W, Chambers JA, Hoya T, Cichocki A. (2006) 'Sequential blind source separation based exclusively on second-order statistics developed for a class of periodic signals'. IEEE Transactions on Signal Processing, 54 (3), pp. 1028-1040.
  • Yuan L, Wang W, Chambers JA. (2005) 'Variable step-size sign natural gradient algorithm for sequential blind source separation'. IEEE Signal Processing Letters, 12 (8), pp. 589-592.
  • Wang W, Sanei S, Chambers JA. (2005) 'Penalty function-based joint diagonalization approach for convolutive blind separation of nonstationary sources'. IEEE Transactions on Signal Processing, 53 (5), pp. 1654-1669.
  • Shoker L, Sanei S, Wang W, Chambers JA. (2005) 'Removal of eye blinking artifact from the electro-encephalogram, incorporating a new constrained blind source separation algorithm'. Med Biol Eng Comput, England: 43 (2), pp. 290-295.
  • Wang W, Jafari M, Sanei S, Chambers J. (2004) 'Blind Separation of Convolutive Mixtures of Cyclostationary Signals'. International Journal of Adaptive Control and Signal Processing, 18 (3), pp. 279-298.

Conference papers

  • Liu Q, Wang W, Jackson P, Barnard M. (2012) 'Reverberant Speech Separation Based on Audio-visual Dictionary Learning and Binaural Cues'. Proc. of IEEE Statistical Signal Processing Workshop (SSP), Ann Arbor, USA: IEEE Statistical Signal Processing Workshop (SSP)
  • Barnard M, Wang W, Kittler J, Naqvi SM, Chambers JA. (2012) 'A Dictionary Learning Approach to Tracking'. International Conference on Acoustics, Speech and Signal Processing.
  • Dai W, Xu T, Wang W. (2012) 'Dictionary learning and update based on simultaneous codeword optimization (SimCO)'. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp. 2037-2040.
  • Jan T, Wang W. (2012) 'Joint blind dereverberation and separation of speech mixtures'. European Signal Processing Conference, pp. 2343-2347.
  • Jan T, Wang W. (2012) 'Blind reverberation time estimation based on Laplace distribution'. European Signal Processing Conference, pp. 2050-2054.
  • Xu T, Wang W. (2011) 'Methods for learning adaptive dictionary in underdetermined speech separation'. IEEE Proceedings of MLSP2011, Beijing, China: 2011 IEEE International Workshop on Machine Learning for Signal Processing, pp. 1-6.
  • Wang W, Mustafa H. (2011) 'Single channel music sound separation based on spectrogram decomposition and note classification'. Springer Lecture Notes in Computer Science: Exploring Music Contents, Malaga, Spain: CMMR 2010: 7th International Symposium 6684, pp. 84-101.

    Abstract

    Separating multiple music sources from a single channel mixture is a challenging problem. We present a new approach to this problem based on non-negative matrix factorization (NMF) and note classification, assuming that the instruments used to play the sound signals are known a priori. The spectrogram of the mixture signal is first decomposed into building components (musical notes) using an NMF algorithm. The Mel frequency cepstrum coefficients (MFCCs) of both the decomposed components and the signals in the training dataset are extracted. The mean squared errors (MSEs) between the MFCC feature space of the decomposed music component and those of the training signals are used as the similarity measures for the decomposed music notes. The notes are then labelled to the corresponding type of instruments by the K nearest neighbors (K-NN) classification algorithm based on the MSEs. Finally, the source signals are reconstructed from the classified notes and the weighting matrices obtained from the NMF algorithm. Simulations are provided to show the performance of the proposed system. © 2011 Springer-Verlag Berlin Heidelberg.

  • Alinaghi A, Wang W, Jackson PJB. (2011) 'Integrating binaural cues and blind source separation method for separating reverberant speech mixtures'. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp. 209-212.

    Abstract

    This paper presents a new method for reverberant speech separation, based on the combination of binaural cues and blind source separation (BSS) for the automatic classification of the time-frequency (T-F) units of the speech mixture spectrogram. The main idea is to model interaural phase difference, interaural level difference and frequency bin-wise mixing vectors by Gaussian mixture models for each source and then evaluate that model at each T-F point and assign the units with high probability to that source. The model parameters and the assigned regions are refined iteratively using the Expectation-Maximization (EM) algorithm. The proposed method also addresses the permutation problem of the frequency domain BSS by initializing the mixing vectors for each frequency channel. The EM algorithm starts with binaural cues and after a few iterations the estimated probabilistic mask is used to initialize and re-estimate the mixing vector model parameters. We performed experiments on speech mixtures, and showed an average of about 0.8 dB improvement in signal-to-distortion ratio (SDR) over the binaural-only baseline. © 2011 IEEE.

  • Liu Q, Wang W. (2011) 'Blind source separation and visual voice activity detection for target speech extraction'. IEEE Proceedings of 2011 3rd International Conference on Awareness Science and Technology, Dalian, China: iCAST 2011, pp. 457-460.

    Abstract

    Despite being studied extensively, the performance of blind source separation (BSS) is still limited especially for the sensor data collected in adverse environments. Recent studies show that such an issue can be mitigated by incorporating multimodal information into the BSS process. In this paper, we propose a method for the enhancement of the target speech separated by a BSS algorithm from sound mixtures, using visual voice activity detection (VAD) and spectral subtraction. First, a classifier for visual VAD is formed in the off-line training stage, using labelled features extracted from the visual stimuli. Then we use this visual VAD classifier to detect the voice activity of the target speech. Finally we apply a multi-band spectral subtraction algorithm to enhance the BSS-separated speech signal based on the detected voice activity. We have tested our algorithm on the mixtures generated artificially by the mixing filters with different reverberation times, and the results show that our algorithm improves the quality of the separated target signal. © 2011 IEEE.

  • Zubair S, Wang W. (2011) 'Audio classification based on sparse coefficients'. IET Seminar Digest, 2011 (4)
  • Liu Q, Naqvi SM, Wang W, Jackson PJB, Chambers J. (2011) 'Robust feature selection for scaling ambiguity reduction in audio-visual convolutive BSS'. European Signal Processing Conference, Barcelona, Spain: 19th European Signal Processing Conference 2011 (EUSIPCO 2011), pp. 1060-1064.
  • Liu Q, Wang W, Jackson PJB. (2011) 'A visual voice activity detection method with adaboosting'. IET Seminar Digest, London, UK: Sensor Signal Processing for Defence (SSPD 2011) 2011 (4)

    Abstract

    Spontaneous speech in videos capturing the speaker's mouth provides bimodal information. Exploiting the relationship between the audio and visual streams, we propose a new visual voice activity detection (VAD) algorithm, to overcome the vulnerability of conventional audio VAD techniques in the presence of background interference. First, a novel lip extraction algorithm combining rotational templates and prior shape constraints with active contours is introduced. The visual features are then obtained from the extracted lip region. Second, with the audio voice activity vector used in training, adaboosting is applied to the visual features, to generate a strong final voice activity classifier by boosting a set of weak classifiers. We have tested our lip extraction algorithm on the XM2VTS database (with higher resolution) and some video clips from YouTube (with lower resolution). The visual VAD was shown to offer low error rates.

  • Jan T, Wang W. (2011) 'Empirical mode decomposition for joint denoising and dereverberation'. European Signal Processing Conference, pp. 206-210.
  • Naqvi SM, Khan MS, Chambers JA, Liu Q, Wang W. (2011) 'Multimodal blind source separation with a circular microphone array and robust beamforming'. European Signal Processing Conference, pp. 1050-1054.
  • Liu Q, Wang W, Jackson P. (2010) 'Audio-visual Convolutive Blind Source Separation'. London : IEEE Proc. Sensor Signal Processing for Defence (SSPD 2010), London, UK: Sensor Signal Processing for Defence

    Abstract

    We present a novel method for speech separation from their audio mixtures using the audio-visual coherence. It consists of two stages: in the off-line training process, we use the Gaussian mixture model to characterise statistically the audio-visual coherence with features obtained from the training set; at the separation stage, likelihood maximization is performed on the independent component analysis (ICA)-separated spectral components. To address the permutation and scaling indeterminacies of the frequency-domain blind source separation (BSS), a new sorting and rescaling scheme using the bimodal coherence is proposed. We tested our algorithm on the XM2VTS database, and the results show that our algorithm can address the permutation problem with high accuracy, and mitigate the scaling problem effectively.

  • Liu Q, Wang W, Jackson PJB. (2010) 'Use of Bimodal Coherence to Resolve Spectral Indeterminacy in Convolutive BSS'. Springer Lecture Notes in Computer Science (LNCS 6365), St. Malo, France: 9th International Conference on Latent Variable Analysis and Signal Separation (formerly the International Conference on Independent Component Analysis and Signal Separation) 6365/2010, pp. 131-139.

    Abstract

    Recent studies show that visual information contained in visual speech can be helpful for the performance enhancement of audio-only blind source separation (BSS) algorithms. Such information is exploited through the statistical characterisation of the coherence between the audio and visual speech using, e.g. a Gaussian mixture model (GMM). In this paper, we present two new contributions. An adapted expectation maximization (AEM) algorithm is proposed in the training process to model the audio-visual coherence upon the extracted features. The coherence is exploited to solve the permutation problem in the frequency domain using a new sorting scheme. We test our algorithm on the XM2VTS multimodal database. The experimental results show that our proposed algorithm outperforms traditional audio-only BSS.

  • Liu Q, Wang W, Jackson PJB. (2010) 'Bimodal Coherence based Scale Ambiguity Cancellation for Target Speech Extraction and Enhancement'. ISCA-International Speech Communication Association Proceedings of 11th Annual Conference of the International Speech Communication Association 2010, Makuhari, Japan: 11th Annual Conference of the International Speech Communication Association 2010, pp. 438-441.

    Abstract

    We present a novel method for extracting target speech from auditory mixtures using bimodal coherence, which is statistically characterised by a Gaussian mixture model (GMM) in the offline training process, using the robust features obtained from the audio-visual speech. We then adjust the ICA-separated spectral components using the bimodal coherence in the time-frequency domain, to mitigate the scale ambiguities in different frequency bins. We tested our algorithm on the XM2VTS database, and the results show the performance improvement with our proposed algorithm in terms of SIR measurements.

  • Xu T, Wang W. (2010) 'Learning Dictionary for Underdetermined Blind Speech Separation Based on Compressed Sensing Method'. Proc. INSPIRE Conference on Information Representation and Estimation, London, UK: INSPIRE 2010
  • Xu T, Wang W. (2010) 'A block-based compressed sensing method for underdetermined blind speech separation incorporating binary mask'. IEEE Proceedings of 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, Dallas, USA: ICASSP 2010, pp. 2022-2025.
  • Xu T, Wang W. (2009) 'A compressed sensing approach for underdetermined blind audio source separation with sparse representation'. IEEE Workshop on Statistical Signal Processing Proceedings, Cardiff, UK: SSP '09, pp. 493-496.
  • Jan T, Wang W, Wang D. (2009) 'A multistage approach for blind separation of convolutive speech mixtures'. IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Taipei, Taiwan: ICASSP'09, pp. 1713-1716.
  • Soltuz SM, Wang W, Jackson PJB. (2009) 'A hybrid iterative algorithm for nonnegative matrix factorization'. 2009 IEEE/SP 15th Workshop on Statistical Signal Processing, Vols 1 and 2, Cardiff, Wales: 15th IEEE/SP Workshop on Statistical Signal Processing, pp. 409-412.
  • Liang Y, Wang W, Chambers J. (2009) 'Adaptive signal processing techniques for clutter removal in radar-based navigation systems'. IEEE Conference Record of the 43rd Asilomar Conference on Signals, Systems and Computers, Pacific Grove, USA: Asilomar 2009, pp. 1855-1858.
  • Jan T, Wang W, Wang D. (2008) 'Binaural Speech Separation Based on Convolutive ICA and Ideal Binary Mask Coupled with Cepstral Smoothing'. Proc. 8th IMA International Conference on Mathematics in Signal Processing, Cirencester, UK: IMA 2008
  • Wang W. (2008) 'One Microphone Audio Source Separation Using Convolutive Non-negative Matrix Factorization with Sparseness Constraints'. Proc. 8th IMA International Conference on Mathematics in Signal Processing, Cirencester, UK: IMA 2008
  • Zou X, Wang W, Kittler J. (2008) 'Non-negative Matrix Factorization for Face Illumination Analysis'. Proc. ICA Research Network International Workshop, Liverpool, UK: ICARN 2008, pp. 52-55.
  • Wang W, Zou X. (2008) 'Non-Negative Matrix Factorization based on Projected Nonlinear Conjugate Gradient Algorithm'. Proc. ICA Research Network International Workshop, Liverpool, UK: ICARN 2008, pp. 5-8.
  • Wang W. (2008) 'Convolutive non-negative sparse coding'. IEEE Proceedings of the International Joint Conference on Neural Networks, Hong Kong: IJCNN 2008, pp. 3681-3684.
  • Zhang Y, Chambers JA, Wang W, Kendrick P, Cox TJ. (2007) 'A new variable step-size LMS algorithm with robustness to nonstationary noise'. IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Hawaii, USA: ICASSP'07 3, pp. III-1349-III-1352.
  • Wang W, Luo Y, Chambers JA, Sanei S. (2007) 'Non-negative matrix factorization for note onset detection of audio signals'. IEEE Proceedings of the 2006 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing, Arlington, USA: MLSP 2006, pp. 447-452.
  • Wang W. (2007) 'Squared Euclidean distance based convolutive non-negative matrix factorization with multiplicative learning rules for audio pattern separation'. IEEE International Symposium on Signal Processing and Information Technology, Giza, Egypt: ISSPIT 2007, pp. 347-352.
  • Wang W, Hicks Y, Sanei S, Chambers J, Cosker D. (2005) 'Video assisted speech source separation'. IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Philadelphia, USA: ICASSP'05 V, pp. 425-428.
  • Yuan L, Sang E, Wang W, Chambers JA. (2005) 'An effective method to improve convergence for sequential blind source separation'. Springer-Verlag Berlin Advances in Natural Computation, Pt 1, Proceedings, Changsha, China: 1st International Conference on Natural Computation (ICNC 2005) 3610, pp. 199-208.
  • Wang W, Chambers J, Sanei S. (2004) 'Subband Decomposition for Blind Speech Separation Using a Cochlear Filterbank'. Proc. IMA 6th International Conference on Mathematics in Signal Processing, Cirencester, UK: IMA 2004, pp. 207-210.
  • Wang W, Chambers J, Sanei S. (2004) 'Penalty Function Based Joint Diagonalization Approach for Convolutive Constrained BSS of Nonstationary Signals'. Technische Universität Wien Proc. 12th European Signal Processing Conference, Vienna, Austria: EUSIPCO 2004
  • Sanei S, Wang W, Chambers J. (2004) 'A Coupled HMM for Solving the Permutation Problem in Frequency Domain BSS'. IEEE Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, Montreal, Canada: ICASSP 2004, pp. 565-568.
  • Chambers J, Wang W. (2004) 'Frequency domain blind source separation'. IET Seminar Digest, 2004 (10774), pp. 2-2.
  • Wang W, Sanei S, Chambers JA. (2004) 'A novel hybrid approach to the permutation problem of frequency domain blind source separation'. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3195, pp. 532-539.
  • Wang W, Chambers JA, Sanei S. (2004) 'Penalty function approach for constrained convolutive blind source separation'. Springer Lecture Notes in Computer Science: Independent Component Analysis and Blind Signal Separation, Granada, Spain: ICA 2004: 5th International Conference 3195, pp. 661-668.
  • Sanei S, Spyrou L, Wang W, Chambers JA. (2004) 'Localization of P300 sources in schizophrenia patients using constrained BSS'. Springer Lecture Notes in Computer Science: Independent Component Analysis and Blind Signal Separation, Granada, Spain: ICA 2004: 5th International Conference 3195, pp. 177-184.
  • Wang W, Sanei S, Chambers J. (2003) 'Hybrid Scheme of Convolutive BSS and Beamforming for Speech Signal Separation Using Psychoacoustics Filtering'. Proc. International Conference on Control Science and Engineering, Harbin, China: ICCSE 2003
  • Wang W, Jafari M, Sanei S, Chambers J. (2003) 'Blind Separation of Convolutive Mixtures of Cyclostationary Sources Using an Extended Natural Gradient Method'. IEEE Proc. IEEE 7th International Symposium on Signal Processing and its Applications, Paris, France: ISSPA 2003 2, pp. 93-96.
  • Wang W, Sanei S, Chambers J. (2003) 'A Joint Diagonalization Method for Convolutive Blind Separation of Nonstationary Sources in the Frequency Domain'. Proc. 4th International Symposium on Independent Component Analysis and Blind Signal Separation, Nara, Japan: ICA 2003, pp. 939-944.

Books

  • Wang W. (2010) Machine Audition: Principles, Algorithms and Systems. New York, USA : Information Science Reference
  • Zhou S, Wang W. (2009) IEEE/WRI Global Congress on Intelligent Systems Proceedings. USA : IEEE Computer Society

Book chapters

  • Wang W. (2011) 'Preface of Machine Audition: Principles, Algorithms and Systems'. in Wang W (ed.) Machine Audition: Principles, Algorithms and Systems Information Science Reference , pp. xv-xxi.

    Abstract

    "This book covers advances in algorithmic developments, theoretical frameworks, and experimental research findings to assist professionals who want an improved ...

  • Jan T, Wang W. (2010) 'Cocktail Party Problem: Source Separation Issues and Computational Methods'. in Wang W (ed.) Machine Audition: Principles, Algorithms and Systems New York, USA : Information Science Reference Article number 3 , pp. 61-79.
  • Wang W. (2010) 'Instantaneous versus Convolutive Non-negative Matrix Factorization: Models, Algorithms and Applications to Audio Pattern Separation'. in Wang W (ed.) Machine Audition: Principles, Algorithms and Systems Information Science Reference Article number 15 , pp. 353-370.

Teaching

2012/2013

  • EEM.ivc - Image and Video Compression (Spring 2013)
  • EEM.sap - Speech and Audio Processing & Coding (Autumn 2012)
  • EE2.mpr - Media (Audio-Visual) Processing (Spring 2013)
  • EE1.pro - Programming: Labs & Marking (Spring 2013)

2011/2012

  • EEM.ivc - Image and Video Compression (Spring 2012)
  • EEM.sap - Speech and Audio Processing & Coding (Autumn 2011)
  • EE2.mpr - Media (Audio-Visual) Processing (Autumn 2011)
  • EE1.pro - Programming: Labs & Marking (Autumn 2011 & Spring 2012)
  • EE1.eps - EDPS: Basic Computing Skills (Autumn 2011)

2010/2011

  • EEM.ivc - Image and Video Compression (Spring 2011)
  • EEM.sap - Speech and Audio Processing & Coding (Autumn 2010)
  • EE1.pro - Programming: Labs & Marking (Autumn 2010 & Spring 2011)
  • EE1.eps - EDPS: Basic Computing Skills (Autumn 2010)

2009/2010

  • EEM.ivc - Image and Video Compression (Spring 2010)
  • EEM.sap - Speech and Audio Processing & Coding (Autumn 2009)
  • EE1.pro - Programming: Labs & Marking (Autumn 2009 & Spring 2010)
  • EE1.eps - EDPS: Basic Computing Skills (Autumn 2009)

2008/2009

  • EEM.ivc - Image and Video Compression (Spring 2009)
  • EEM.sap - Speech and Audio Processing & Coding (Autumn 2008)
  • EE1.pro - Introduction to Programming: Labs & Marking (Autumn 2008 & Spring 2009)

2007/2008

  • EEM.sap - Speech and Audio Processing & Coding (Autumn 2007)
  • EE1.pca - C Programming Labs (Autumn 2007 & Spring 2008)

 

Note: EEM - module for Master's students; EE1 - module for first-year undergraduate students.

Departmental Duties

Selected Recent Activities

  • Organising Committee Member, CISP 2013, London, December 2-3, 2013.
  • Program Committee Member, BMVC 2013, Bristol, UK, Sept 9-13, 2013.
  • Program Committee Member, SIP 2013, Banff, Canada, July 17-19, 2013.
  • Special Session Co-Chair (with Jonathon Chambers and Zoran Cvetkovic), DSP 2013, Santorini, Greece, July 1-3, 2013.
  • Program Committee Member, ICICIP 2013, Beijing, China, June 9-11, 2013.
  • Program Committee Member, ICASSP 2013, Vancouver, Canada, May 26-31, 2013.
  • Tutorial Speaker (with Wei Dai and Boris Mailhe), ICASSP 2013, "Dictionary Learning for Sparse Representations: Algorithms and Applications", Vancouver, Canada, May 26-31, 2013.
  • Program Committee Member, SENSORNETS 2013, Barcelona, Spain, February 19-21, 2013.
  • External PhD Examiner, PhD Thesis: "Sparse Approximation and Dictionary Learning with Applications to Audio Signals", Queen Mary University of London, December 2012.
  • Independent Expert, European Commission, grant evaluation, November 2012.
  • Session Chair, SSPD 2012, "Sensor Arrays", London, UK, 25-27 September, 2012.
  • Program Committee Member, ISCSLP 2012, Hong Kong, China, December 5-8, 2012.
  • Program Committee Member, SSPD 2012, London, UK, 25-27 September, 2012.
  • Session Chair, EUSIPCO 2012, "P-ML-1: Machine Learning", Bucharest, Romania, 27 - 31 August, 2012.
  • Session Co-Chair (with Ali Taylan Cemgil), ICASSP 2012, "MLSP-L3: Applications in Audio, Speech, and Image Processing", Kyoto, Japan, 25-30 March, 2012.
  • Program Committee Member, S+SSPR 2012, Hiroshima, Japan, 7 - 9 November, 2012.
  • Program Committee Member, CISP 2012, Chongqing, China, 16-18 October, 2012.
  • Program Committee Member, UKCI 2012, Edinburgh, UK, 5-11 September, 2012.
  • Area Chair, EUSIPCO 2012, Bucharest, Romania, 27 - 31 August, 2012.
  • Program Committee Member, BMVC 2012, Guildford, UK, 3 - 7 September, 2012.
  • Program Committee Member, EUSIPCO 2012, Bucharest, Romania, 27 - 31 August, 2012.
  • Program Committee Member, SIP 2012, Honolulu, USA, 20 - 22 August, 2012.
  • Program Committee Member, ISNN 2012, Shenyang, China, 11-14 July, 2012.
  • Program Committee Member, ICSAI 2012, Yantai, China, 19-21 May, 2012.
  • Program Committee Member, ICASSP 2012, Kyoto, Japan, 25-30 March, 2012.
  • Program Committee Member, ICIST 2012, Wuhan, China, 23-25 March, 2012.
  • Program Committee Member, SENSORNETS 2012, Rome, Italy, 24-26 February, 2012.
  • Internal PhD Examiner, PhD Thesis: "Novel Tensor Factorization Based Approaches for Blind Source Separation", Department of Computing, University of Surrey, December 2011.
  • Session Chair, EUSIPCO 2011, "Multichannel Acoustic Processing I", Barcelona, Spain, 29 August - 2 September, 2011.
  • Program Committee Member, SIP 2011, Dallas, USA, 14-16 December, 2011.
  • Program Committee Member, CISP 2011, Shanghai, China, 15-17 October, 2011.
  • Technical Committee Member, SSPD 2011, London, UK, 28-29 September, 2011.
  • Program Committee Member, UKCI 2011, Manchester, UK, September, 2011.
  • Special Session Co-Chair (with Jonathon Chambers and Bertrand Rivet), EUSIPCO 2011, "Multimodal (Audio-Visual) Speech Separation", Barcelona, Spain, 29 August - 2 September, 2011.
  • Program Committee Member, BMVC 2011, Dundee, UK, 29 August - 2 September, 2011.
  • Headstart Project Leader, School Outreach, Guildford, 17-20 July, 2011.
  • Program Committee Member, SIPA 2011, Crete, Greece, 22-24 June, 2011.
  • Program Committee Member, CSIE 2011, Changchun, China, 17-19 June, 2011.
  • Program Committee Member, ISNN 2011, Guilin, China, 29 May - 1 June, 2011.
  • Program Committee Member, ICIST 2011, Nanjing, China, 26-28 March, 2011.
  • Grant Reviewer, EPSRC, first grant proposal, March, 2011.
  • External PhD Examiner, School of Engineering, University of Edinburgh, 2010.
  • Grant Reviewer, PASCAL2, internal visiting proposal, August, 2010.
  • Headstart Project Leader, School Outreach, Guildford, 19-22 July, 2010.
  • Technical Committee Member, Conference on Sensor Signal Processing for Defence (SSPD 2010), London, UK, 29-30 September, 2010.
  • Program Committee Member, BMVC 2010, Aberystwyth, UK, 31 August - 3 September, 2010.
  • Program Committee Member, IEEE WCSE 2010, Wuhan, China, December 19-20, 2010.
  • Program Committee Member, IWACI 2010, Suzhou, China, August 25-27, 2010.
  • Program Committee Member, SIP 2010, Maui, Hawaii, USA, August 23-25, 2010.
  • Program Committee Member, S+SSPR 2010, Cesme, Turkey, August 18-20, 2010.
  • Program Committee Member, ISNN 2010, Shanghai, China, June 6-9, 2010.
  • Publicity Chair, IEEE International Workshop on Statistical Signal Processing (SSP 2009), Cardiff, UK, Aug. 31 - Sept. 3, 2009.
  • Program Co-chair, IEEE Global Congress on Intelligent Systems (GCIS 2009), Xiamen, China, May 19-21, 2009.
  • Program Committee Member, SIP 2009, Honolulu, Hawaii, USA, August 17-19, 2009.
  • Program Committee Member, IEEE WCSE 2009, Xiamen, China, May 19-21, 2009.
  • Session Chair, ICA Research Network International Workshop (ICARN 2008), Liverpool, UK, September 25-26, 2008.
  • Chair, oral session "Unsupervised learning III", IEEE WCCI 2008, Hong Kong, China, June 1-6, 2008.
  • Guest editor, special issue "Advances in Nonnegative Matrix and Tensor Factorization", Computational Intelligence and Neuroscience (Hindawi), edited by A. Cichocki, M. Morup, P. Smaragdis, W. Wang, and R. Zdunek, May 2008.
  • Program Committee Member, SIP 2008, Kailua-Kona, Hawaii, USA, August 18-20, 2008.
  • Technical Committee Member, IEEE WCCI 2008, Hong Kong, China, June 1-6, 2008.

Invited Talks

  • W. Wang, "Machine Audition at CVSSP", in UK & IE Speech Conference, Birmingham, UK, December 17-18, 2012.
  • W. Wang, "Dictionary Learning Algorithms in Sparse Representations and Signal Processing," (Organizer: Dr Wei Liu), Department of Electronic and Electrical Engineering, Sheffield University, October 24, 2012.
  • W. Wang, "Dictionary Learning Algorithms in Signal Processing," (Organizer: Dr Lu Gan), School of Engineering and Design, Brunel University, August 1, 2012.
  • W. Wang, "Adaptive Dictionary Learning Algorithms for Image Denoising, Source Separation, and Visual Tracking," (Organizer: Dr Andrew Aubrey), Cardiff School of Computer Science and Informatics, Cardiff University, May 24, 2012.
  • W. Wang, "Dictionary Learning Algorithms and Their Applications in Source Separation, Speaker Tracking, and Image Denoising," (Organizer: Prof Mark Plumbley), School of Electronic Engineering and Computer Science, Queen Mary University of London, April 25, 2012.
  • W. Wang, "Audio and Audio-Visual Source Separation," (Organizer: Dr Xiaorong Shen), School of Automation Science and Electrical Engineering, Beihang University, Beijing, September 20, 2011.
  • T. Xu and W. Wang, "Compressive Sensing," (Organizer: Prof. Anthony Ho), Department of Computer Science, University of Surrey, Guildford, January 11, 2010.
  • W. Wang, "Multimodal Blind Source Separation for Robot Audition," (Organizer: Dr. Tania Stathaki), MOD University Defence Research Centre Launch & Theme Meeting, Imperial College London, London, November 5, 2009.
  • W. Wang, "Two-microphone Speech Separation Based on Convolutive ICA and Ideal Binary Mask Coupled with Cepstral Smoothing," (Organizer: Prof. Francis Rumsey), Institute of Sound Recording (IoSR), University of Surrey, Guildford, October 21, 2008.
  • W. Wang, "Convolutive ICA and NMF for Audio Source Separation and Perception," (Organizers: Prof. Vladimir M. Sloutsky & Prof. DeLiang Wang), Center for Cognitive Science, Ohio State University, Columbus, April 11, 2008.
  • W. Wang, "Audio Source Separation and Perception," (Organizer: Prof. DeLiang Wang), Perception and Neurodynamics Laboratory (PNL), Department of Computer Science and Engineering, Ohio State University, Columbus, March 07, 2008.
  • W. Wang, "Intelligent Data Fusion Based Blind Source Separation," (Organizer: Dr Nathan Wood), Royal Academy of Engineering, London, April 11, 2005.
  • W. Wang and J.A. Chambers, "Frequency Domain Blind Source Separation," IEE Seminar on Blind Source Separation in Biomedicine (Organizer: Dr. Christopher J. James), British Institute of Radiology, London, 1 Dec. 2004.
  • W. Wang, "Frequency Domain BSS and its Associated Permutation Problem," Contract Researchers Conference at Cardiff School of Engineering (Organizer: Dr. Adrian Porch), Cardiff University, Cardiff, July 16, 2004.
  • W. Wang, "Blind Signal Processing and Speech Enhancement," Series Forum for Celebration of the 50th Anniversary of Harbin Engineering University (Organizer: Prof. Yanling Hao), Harbin, Apr. 11, 2003.
  • W. Wang, S. Sanei, and J.A. Chambers, "Has the Permutation Problem in Transform Domain BSS Been Solved?," IEE Workshop on Independent Component Analysis: Generalizations, Algorithms and Applications (Organizer: Dr. Mike Davies), Queen Mary University of London, London, Dec. 20, 2002.

Tutorial Speech 

  • W. Dai, W. Wang, and B. Mailhe, ICASSP 2013, "Dictionary Learning for Sparse Representations: Algorithms and Applications", Vancouver, Canada, May 26-31, 2013.

Page Owner: ww0003
Page Created: Thursday 16 September 2010 14:47:13 by lb0014
Last Modified: Wednesday 13 March 2013 18:38:06 by ww0003