Dr Chris Hummersone
Lecturer (IoSR), Sound Recording (Tonmeister) Admissions Officer
Qualifications: BMus (Tonmeister), PhD (Surrey), MAES, MIEEE, AHEA
Email: christopher.hummersone@surrey.ac.uk
Phone: Work: 01483 68 6167
Room no: 07 BC 03
Further information
Biography
I graduated from the Tonmeister course in June 2007 and joined the IoSR as a research student in October 2007. I completed my thesis, entitled "A Psychoacoustic Engineering Approach to Machine Sound Source Separation in Reverberant Environments", in September 2010 and joined the IoSR as a lecturer in October 2010. In my spare time I enjoy playing the saxophone, cycling and running, having completed the London Marathon in 2007 and 2012.
Research Interests
My research interests include modelling the precedence effect and binaural localisation, computational auditory scene analysis, and machine listening for the automated evaluation of audio quality.
Research Collaborations
A psychoacoustic engineering approach to machine sound source separation in reverberant environments
Publications
Journal articles
- (2011) 'Ideal Binary Mask Ratio: a novel metric for assessing binary-mask-based sound source separation algorithms'. IEEE Transactions on Audio, Speech and Language Processing, 19 (7), pp. 2039-2045. Full text is available at: http://epubs.surrey.ac.uk/7195/
Abstract
A number of metrics has been proposed in the literature to assess sound source separation algorithms. The addition of convolutional distortion raises further questions about the assessment of source separation algorithms in reverberant conditions as reverberation is shown to undermine the optimality of the ideal binary mask (IBM) in terms of signal-to-noise ratio (SNR). Furthermore, with a range of mixture parameters common across numerous acoustic conditions, SNR–based metrics demonstrate an inconsistency that can only be attributed to the convolutional distortion. This suggests the necessity for an alternate metric in the presence of convolutional distortion, such as reverberation. Consequently, a novel metric—dubbed the IBM ratio (IBMR)—is proposed for assessing source separation algorithms that aim to calculate the IBM. The metric is robust to many of the effects of convolutional distortion on the output of the system and may provide a more representative insight into the performance of a given algorithm.
- (2010) 'Dynamic precedence effect modeling for source separation in reverberant environments'. IEEE Transactions on Audio, Speech and Language Processing, 18 (7), pp. 1867-1871. Full text is available at: http://epubs.surrey.ac.uk/2935/
Abstract
Reverberation continues to present a major problem for sound source separation algorithms. However, humans demonstrate a remarkable robustness to reverberation and many psychophysical and perceptual mechanisms are well documented. The precedence effect is one of these mechanisms; it aids our ability to localize sounds in reverberation. Despite this, relatively little work has been done on incorporating the precedence effect into automated source separation. Furthermore, no work has been carried out on adapting a precedence model to the acoustic conditions under test and it is unclear whether such adaptation, analogous to the perceptual Clifton effect, is even necessary. Hence, this study tests a previously proposed binaural separation/precedence model in real rooms with a range of reverberant conditions. The precedence model inhibitory time constant and inhibitory gain are varied in each room in order to establish the necessity for adaptation to the acoustic conditions. The paper concludes that adaptation is necessary and can yield significant gains in separation performance. Furthermore, it is shown that the initial time delay gap and the direct-to-reverberant ratio are important factors when considering this adaptation. © 2010 IEEE.
Conference papers
- (2010) 'A comparison of computational precedence models for source separation in reverberant environments'. Audio Engineering Society Preprint, London, UK: 128th Audio Engineering Society Convention 7981. Full text is available at: http://epubs.surrey.ac.uk/2936/
- (2007) 'Potential biases in MUSHRA listening tests'. Audio Engineering Society Preprint, New York: 123rd Audio Engineering Society Convention 7179. Full text is available at: http://epubs.surrey.ac.uk/7252/
Abstract
The method described in the ITU-R BS.1534-1 standard, commonly known as MUSHRA (MUltiple Stimulus with Hidden Reference and Anchors), is widely used for the evaluation of systems exhibiting intermediate quality levels, in particular low-bit rate codecs. This paper demonstrates that this method, despite its popularity, is not immune to biases. In two different experiments designed to investigate potential biases in the MUSHRA test, systematic discrepancies in the results were observed with a magnitude up to 22%. The data indicates that these discrepancies could be attributed to the stimulus spacing and range equalizing biases.
Posters
- (2010) Machine Listening for Sound Quality Evaluation. Machine Listening Workshop 2010, Queen Mary University of London. Full text is available at: http://epubs.surrey.ac.uk/2925/
- (2010) A perceptually–inspired approach to machine sound source separation in real rooms. University of Surrey Postgraduate Research Conference. Full text is available at: http://epubs.surrey.ac.uk/7250/
Theses and dissertations
- (2011) A Psychoacoustic Engineering Approach to Machine Sound Source Separation in Reverberant Environments. Full text is available at: http://epubs.surrey.ac.uk/2923/
Abstract
Reverberation continues to present a major problem for sound source separation algorithms, due to its corruption of many of the acoustical cues on which these algorithms rely. However, humans demonstrate a remarkable robustness to reverberation and many psychophysical and perceptual mechanisms are well documented. This thesis therefore considers the research question: can the reverberation–performance of existing psychoacoustic engineering approaches to machine source separation be improved? The precedence effect is a perceptual mechanism that aids our ability to localise sounds in reverberant environments. Despite this, relatively little work has been done on incorporating the precedence effect into automated sound source separation. Consequently, a study was conducted that compared several computational precedence models and their impact on the performance of a baseline separation algorithm. The algorithm included a precedence model, which was replaced with the other precedence models during the investigation. The models were tested using a novel metric in a range of reverberant rooms and with a range of other mixture parameters. The metric, termed Ideal Binary Mask Ratio, is shown to be robust to the effects of reverberation and facilitates meaningful and direct comparison between algorithms across different acoustic conditions. Large differences between the performances of the models were observed. The results showed that a separation algorithm incorporating a model based on interaural coherence produces the greatest performance gain over the baseline algorithm. The results from the study also indicated that it may be necessary to adapt the precedence model to the acoustic conditions in which the model is utilised. This effect is analogous to the perceptual Clifton effect, which is a dynamic component of the precedence effect that appears to adapt precedence to a given acoustic environment in order to maximise its effectiveness. 
However, no work has been carried out on adapting a precedence model to the acoustic conditions under test. Specifically, although the necessity for such a component has been suggested in the literature, neither its necessity nor benefit has been formally validated. Consequently, a further study was conducted in which parameters of each of the previously compared precedence models were varied in each room in order to identify if, and to what extent, the separation performance varied with these parameters. The results showed tha
Teaching
My teaching duties include:
- HE1 Audio Engineering & Recording Techniques A/B (Technical Ear Training)
- HE3 Video Engineering
- HE3 Technical Project (Audio Research Seminars)
Book a tutorial (IoSR members only)
Departmental Duties
I am currently the Assistant Director of Research at the IoSR and Admissions Officer for the Tonmeister programme.
Downloads
Publications
Some of my publications, including my thesis, are available to download from the Surrey Research Insights website.
MUSHRA Max/MSP patcher (version 2.0)
This Max/MSP patcher is designed for conducting MUSHRA (MUltiple Stimulus with Hidden Reference and Anchor, see ITU-R BS.1534) listening tests. The patch allows the comparison of a number of stimuli (7 by default, although up to 10 are supported) and facilitates repeats. By default, the patch can compare seven processes applied to three music tracks, with two repeats, making a total of six pages. However, the patch can easily be adapted to accommodate a different number of stimuli, music tracks or repeats; the included documentation describes the procedure in detail. Version 2.0 will work with Max/MSP 4 or higher.
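The page arithmetic above (three tracks, each repeated twice, gives six pages of seven stimuli) can be illustrated with a short Python sketch. This is not the patcher itself, which is a Max/MSP patch; the function name, the track and process labels, and the shuffling of stimuli and pages are all assumptions made for illustration.

```python
import random


def mushra_pages(tracks, processes, repeats, seed=None):
    """Build a hypothetical MUSHRA session plan.

    Each page pairs one track with every process. Shuffling the
    stimulus order and the page order is an assumption here, not a
    documented behaviour of the patcher.
    """
    rng = random.Random(seed)
    pages = []
    for track in tracks:
        for repeat in range(1, repeats + 1):
            order = list(processes)
            rng.shuffle(order)  # hide which slider belongs to which process
            pages.append({"track": track, "repeat": repeat, "stimuli": order})
    rng.shuffle(pages)  # randomise the order in which pages are presented
    return pages


# Default configuration described in the text: 7 processes, 3 tracks, 2 repeats.
pages = mushra_pages(
    tracks=["track_1", "track_2", "track_3"],
    processes=[f"process_{i}" for i in range(1, 8)],
    repeats=2,
    seed=0,
)
print(len(pages))  # number of pages = tracks x repeats
```

Changing `tracks`, `processes` or `repeats` shows how the number of pages scales when the test is adapted.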
MUSHRA_patcher (1238.34KB)
ABX Max/MSP patcher
This Max/MSP patcher (compatible with Max/MSP 4.6 or higher) implements an ABX listening test. The test is a discrimination task, assessing the listener's ability to hear differences in audio files that contain small impairments. The patcher chooses a reference stimulus and a test stimulus, which are randomly assigned to A and B. Either A or B is then randomly assigned to X. The listener must decide whether X is A or B. The patcher chooses the test stimulus from a pool of several audio files. The presentation of each audio file can be repeated a specified number of times. The presentation of all audio files and repeats is randomised.
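The trial logic described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a port of the Max/MSP patcher: the function names, file names and dictionary layout are hypothetical, and a single fixed reference file is assumed.

```python
import random


def abx_trials(reference, test_pool, repeats, seed=None):
    """Generate a randomised list of ABX trials (illustrative only)."""
    rng = random.Random(seed)
    trials = []
    for test in test_pool:
        for _ in range(repeats):
            pair = [reference, test]
            rng.shuffle(pair)            # random assignment to A and B
            a, b = pair
            answer = rng.choice("AB")    # X is a copy of either A or B
            x = a if answer == "A" else b
            trials.append({"A": a, "B": b, "X": x, "answer": answer})
    rng.shuffle(trials)                  # randomise files and repeats together
    return trials


def proportion_correct(trials, responses):
    """Fraction of trials in which the listener identified X correctly."""
    hits = sum(t["answer"] == r for t, r in zip(trials, responses))
    return hits / len(trials)


trials = abx_trials("reference.wav", ["test_1.wav", "test_2.wav", "test_3.wav"],
                    repeats=4, seed=1)
print(len(trials))  # 3 files x 4 repeats
```

A listener who cannot hear the impairment would be expected to score close to 0.5 with `proportion_correct`, since X is equally likely to be A or B.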
ABX patcher (79.09KB)
A Localisation- and Precedence-based Binaural Separation Algorithm (version 1.0)
The separation software developed during my PhD is available below. It is based upon Palomäki et al.'s (2004) binaural processor for missing data speech recognition in the presence of noise and small-room reverberation. The software generates "cocktail party" mixtures of signals arising from two spatially–separate sound sources in real rooms and attempts to separate them using interaural cues enhanced by models of the precedence effect. Implemented precedence models include those proposed by Martin (1997), Faller & Merimaa (2004), Lindemann (1986) and Macpherson (1991). The software is written for Matlab and should work on most platforms and versions. It requires the signal processing toolbox and a compatible C compiler.
PrecSep_toolbox (3987.83KB)
Binaural Room Impulse Responses Captured in Real Rooms
The Binaural Room Impulse Responses (BRIRs) used in the above software are packaged with it, but are also available separately below at a higher sampling frequency. The responses were captured in real rooms at the university, with sound sources placed on the frontal azimuthal plane (±90°) in 5° increments. The package includes documentation on how and where the responses were captured. The BRIRs are stereo wave files, captured at 48 kHz, 16 bit; versions down-sampled to 16 kHz are also included.
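As a rough illustration of how BRIRs such as these can be used to build a two-source mixture, the Python sketch below enumerates the azimuth grid (±90° in 5° steps) and convolves each source with a left/right impulse response pair before summing. It is not the Matlab toolbox code: the function names and the `(left, right)` data layout are assumptions, file loading is omitted, and the pure-Python convolution is only for clarity (`numpy.convolve` would be the usual choice for real audio).

```python
# Source azimuths in degrees: -90, -85, ..., 85, 90
AZIMUTHS = list(range(-90, 91, 5))


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_mixture(target, interferer, brir_target, brir_interferer):
    """Mix two mono sources, each spatialised by its own BRIR.

    Each BRIR is assumed to be a (left, right) pair of impulse
    responses. Returns a (left, right) pair of sample lists, truncated
    to the shorter of the two spatialised signals.
    """
    channels = []
    for ear in (0, 1):
        t = convolve(target, brir_target[ear])
        i = convolve(interferer, brir_interferer[ear])
        n = min(len(t), len(i))
        channels.append([t[k] + i[k] for k in range(n)])
    return tuple(channels)


print(len(AZIMUTHS))  # number of source positions on the frontal plane
```

In practice each BRIR would be read from the wave file for the chosen room and azimuth, and the source signals would be resampled to match the sampling rate of the responses.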
BRIRs (13943.27KB)
Simulated Room Impulse Responses
This archive contains three groups of 11 sets of RIRs obtained from a room simulated in CATT-Acoustics modelling software. Each set has a different reverberation time that was varied by changing the absorption coefficient of all six surfaces to produce reverberation times in the interval [0,1] s. The room was shoebox-shaped with dimensions 6×4×3 m (l×w×h). The impulse responses were calculated with the receiver located in the centre of the room at a height of 2 m and the source at a distance of 1.5 m. The omnidirectional sound source was placed at head height on the frontal azimuthal plane (±90°) in 5° increments.
The sets are provided in three groups, with each group corresponding to a different receiver configuration. The three receiver configurations were: binaural, spaced omnidirectional, and mono omnidirectional. In the spaced omnidirectional case, the receivers were spaced by 16 cm – the approximate spacing of the ears in the binaural case. The mono omnidirectional case is provided for completeness.
The RIRs are mono or stereo wave files, simulated at 44.1 kHz, 16 bit; versions down-sampled to 16 kHz are also included.
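The relationship between surface absorption and reverberation time in a room of this size can be approximated with Eyring's formula, RT60 = 0.161 V / (−S ln(1 − α)). The Python sketch below applies it to the 6×4×3 m shoebox. It is only a statistical estimate and is not how the simulation software computes its responses; the even 0.1 s spacing of the 11 target reverberation times is likewise an assumption, since the text states only the interval.

```python
import math

# Room dimensions from the text (metres)
LENGTH, WIDTH, HEIGHT = 6.0, 4.0, 3.0
VOLUME = LENGTH * WIDTH * HEIGHT                                     # m^3
SURFACE = 2 * (LENGTH * WIDTH + LENGTH * HEIGHT + WIDTH * HEIGHT)    # m^2


def eyring_rt60(alpha):
    """Eyring estimate of RT60 (s) for a uniform absorption coefficient."""
    if alpha >= 1.0:
        return 0.0          # fully absorbing surfaces: no reverberation
    if alpha <= 0.0:
        return math.inf     # perfectly reflecting surfaces: no decay
    return 0.161 * VOLUME / (-SURFACE * math.log(1.0 - alpha))


def absorption_for_rt60(rt60):
    """Invert Eyring's formula to find the absorption for a target RT60."""
    if rt60 <= 0.0:
        return 1.0
    return 1.0 - math.exp(-0.161 * VOLUME / (SURFACE * rt60))


# Assumed targets: 11 evenly spaced reverberation times in [0, 1] s
targets = [i / 10 for i in range(11)]
for rt in targets:
    print(f"RT60 = {rt:.1f} s -> absorption coefficient = {absorption_for_rt60(rt):.3f}")
```

The 0 s case corresponds to fully absorbing surfaces, which is why the inverse function returns an absorption coefficient of 1 there.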
CATT-Acoustics Room Impulse Responses (69046.11KB)
Matlab Files
My Matlab community profile contains a selection of small functions that have arisen during the course of my research.




