Institute of Sound Recording (IoSR)
The Institute of Sound Recording (IoSR) is responsible for world-class research in psychoacoustic engineering and offers postgraduate research-based MPhil and PhD programmes in this area, as well as being home to the world-famous Tonmeister® BMus undergraduate degree course in Music & Sound Recording.
The IoSR has strong links with a wide range of companies in the audio industry, through both research and the Tonmeister course, including Abbey Road Studios, AMS-Neve, Avid, Bang & Olufsen, BBC Research and Development, Bowers & Wilkins, Dolby, Focusrite, Genelec, Real World Studios, and Solid State Logic, among many others.
The IoSR also benefits from a range of professional facilities of the highest standard, including three recording studios (Studio 1, Studio 2 and Studio 3), three edit rooms, over 100 microphones, and an ITU-R BS.1116-compliant listening room.
Since its creation in 1998, the Institute of Sound Recording (IoSR) has become known internationally as a leading centre for research in psychoacoustic engineering, with world-class facilities and significant funding from research councils (in particular EPSRC) and from industry; we have successfully completed projects in collaboration with Adrian James Acoustics, Bang & Olufsen, BBC R&D, Genelec, Harman-Becker, Institut für Rundfunktechnik, Meridian Audio, Nokia, Pharos Communications and Sony BPE. Additionally, the IoSR was a founding partner in the EPSRC-funded Digital Music Research Network (DMRN) and the Spatial Audio Creative Engineering Network (SpACE-Net).
We are interested in human perception of audio quality, primarily of high-fidelity music signals. Overall perceived quality depends, at least in part, on perception of lower-level timbral and spatial attributes such as brightness, warmth, locatedness and envelopment. These attributes depend, in turn, on acoustic parameters such as frequency spectrum and inter-aural cross-correlation coefficient. Using a combination of acoustic measurement and human listening tests we are exploring the connections between acoustic parameters and perceived timbral and spatial attributes, and also between these perceptual attributes and overall quality and listener preference. From our findings we are developing mathematical and computational models of human auditory perception, and engineering perceptually-motivated audio tools.
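One of the acoustic parameters mentioned above, the inter-aural cross-correlation coefficient, is conventionally defined as the peak of the normalised cross-correlation between the two ear signals, searched over the range of natural inter-aural delays (roughly ±1 ms). The following is a minimal Python/NumPy sketch of that definition, not the IoSR's own implementation; the function and parameter names are illustrative:

```python
import numpy as np

def iacc(left, right, fs, max_lag_ms=1.0):
    """Inter-aural cross-correlation coefficient: the peak of the
    normalised cross-correlation between the ear signals, searched
    over lags of roughly +/-1 ms (the range of natural inter-aural
    delays).  Values near 1 indicate highly correlated ear signals."""
    n = len(left)
    full = np.correlate(left, right, mode="full")   # lags -(n-1)..(n-1)
    max_lag = int(round(fs * max_lag_ms / 1000))
    centre = n - 1                                  # index of zero lag
    window = full[centre - max_lag : centre + max_lag + 1]
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    return np.max(np.abs(window)) / norm
```

An IACC near 1 is commonly associated with a narrow, well-located source, and lower values with wider, more enveloping sound; how such parameters map onto perceived spatial attributes is precisely the kind of question this research addresses.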
Our work combines elements of acoustics, digital signal processing, psychoacoustics (theoretical and experimental), psychology, sound synthesis, software engineering, statistical analysis and user-interface design, with an understanding of the aesthetics of sound and music.
Applications & Outputs
One particular focus of our work is the development of tools to predict the perceived audio quality of a given soundfield or audio signal. If, for example, a new concert hall, hi-fi system or audio codec is being designed, it is important to know how each candidate prototype would be rated by human listeners and how it would compare to competing products. Traditional acoustic and electronic measurements (e.g. RT60, SNR, THD) can give some indication, but a truly representative assessment requires lengthy listening tests with a panel of skilled human listeners. Such tests are time-consuming, costly and often logistically difficult. The tools that we are developing will describe the quality of the prototype without the need for human listeners.
Similarly, monitoring the quality of broadcast audio (perhaps to make decisions about the allocation of finite bandwidth), or assessing the quality of a recording as it is being made (to make appropriate adjustments to balance, microphones or processing), requires accurate monitoring in a good listening environment and the undivided attention of a skilled engineer. If the monitoring is poor, the listening environment is compromised, or the engineer is multi-tasking or tired, then our audio quality prediction tools may prove invaluable.
Complementary strands of our work deal with the development of quality advisors, which can advise on the most appropriate recording or coding techniques before the source audio is available; of perceptually-motivated signal processing systems, which allow direct control over the timbral and spatial attributes of sound; and of listener training systems, which can strengthen and maintain the skills of listeners who want, or are required, to assess audio critically.
IoSR research projects
Find out more about some of the research projects conducted in the IoSR since 2000.
The software and other digital resources below were created as part of the research undertaken at the Institute of Sound Recording.
- MUSHRA test GUI
- ABX test GUI
- A localisation- and precedence-based binaural separation algorithm
- Perceptually motivated measurement of spatial sound attributes
- Matlab functions
MUSHRA test GUI
This Max/MSP patcher is designed for conducting MUSHRA (MUltiple Stimulus with Hidden Reference and Anchor; see ITU-R BS.1534) listening tests. The patcher allows the comparison of a number of stimuli (7 by default, although up to 10 are supported) and facilitates repeats. By default, the patcher compares seven processes applied to three music tracks, with two repeats, making a total of six pages. However, the software can easily be adapted to accommodate a different number of stimuli, music tracks and/or repeats; the included documentation describes the procedure in detail.
Requirements: Max/MSP 6 or higher.
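The session structure described above (each page presenting every stimulus for one track, with page order and within-page stimulus order randomised) can be sketched outside Max/MSP as follows. This is an illustrative Python sketch of the randomisation logic only, not the patcher itself, and all names are hypothetical:

```python
import random

def mushra_session(tracks, n_stimuli=7, n_repeats=2, seed=None):
    """Build a randomised MUSHRA session.  Each page presents all
    stimuli (hidden reference and anchor among them) for one track,
    in a fresh random order; the order of the pages themselves is
    also randomised."""
    rng = random.Random(seed)
    pages = []
    for _ in range(n_repeats):
        for track in tracks:
            order = list(range(n_stimuli))
            rng.shuffle(order)           # within-page stimulus order
            pages.append({"track": track, "stimulus_order": order})
    rng.shuffle(pages)                   # page order across the session
    return pages

# the default configuration: 3 tracks x 2 repeats = 6 pages of 7 stimuli
session = mushra_session(["track1", "track2", "track3"], seed=1)
```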
ABX test GUI
This Max/MSP patcher implements an ABX listening test: a discrimination task assessing the listener's ability to hear differences between audio files containing small impairments. The patcher chooses a reference stimulus and a test stimulus, which are randomly assigned to A and B; either A or B is then randomly assigned to X, and the listener must decide whether X is A or B. The test stimulus is drawn from a pool of several audio files, each of which can be presented a specified number of times; the presentation order of all audio files and repeats is randomised.
Requirements: Max/MSP 6 or higher.
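The random assignments at the heart of an ABX trial can be sketched as follows; again, this is a hypothetical Python illustration of the trial logic, not the Max/MSP patcher:

```python
import random

def abx_trial(reference, test, rng=None):
    """One ABX presentation: the reference and test stimuli are
    randomly assigned to A and B, and X is a second presentation of
    one of them, chosen at random.  Returns the assignments together
    with the correct answer for later scoring."""
    rng = rng or random.Random()
    a, b = (reference, test) if rng.random() < 0.5 else (test, reference)
    x_is_a = rng.random() < 0.5
    return {"A": a, "B": b,
            "X": a if x_is_a else b,
            "answer": "A" if x_is_a else "B"}

trial = abx_trial("reference.wav", "impaired.wav", random.Random(0))
```

If the listener cannot hear the impairment, their answers over many such trials will be at chance (50%), which is what the test is designed to detect.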
A localisation- and precedence-based binaural separation algorithm
The separation software developed during Chris Hummersone's PhD is available below. It is based upon Palomäki et al.'s (2004) binaural processor for missing data speech recognition in the presence of noise and small-room reverberation. The software generates "cocktail party" mixtures of signals arising from two spatially-separate sound sources in real rooms and attempts to separate them using interaural cues enhanced by models of the precedence effect. Implemented precedence models include those proposed by Martin (1997), Faller & Merimaa (2004), Lindemann (1986) and Macpherson (1991).
Requirements: Matlab, Signal Processing Toolbox, compatible C compiler (for Mex functions).
Perceptually motivated measurement of spatial sound attributes
This software will analyse a binaurally-recorded .wav file, and display predictions of the perceived angular width and direction of the sound as they vary over time.
Requirements: Matlab, Signal Processing Toolbox.
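The kind of prediction this software performs can be illustrated, in greatly simplified form, by estimating a frame's direction from the lag of the inter-aural cross-correlation peak and a simple sine-law head model. This Python sketch is an illustrative assumption, not the actual algorithm; the constants and names are hypothetical:

```python
import numpy as np

EAR_SPACING = 0.16      # m; approximate distance between the ears
SPEED_OF_SOUND = 343.0  # m/s

def itd_to_azimuth(itd):
    """Map an inter-aural time difference (s) to an azimuth (degrees)
    via the simple sine law  itd = (d / c) * sin(theta)."""
    s = np.clip(itd * SPEED_OF_SOUND / EAR_SPACING, -1.0, 1.0)
    return np.degrees(np.arcsin(s))

def frame_azimuth(left, right, fs):
    """Estimate the azimuth of a single frame from the lag of the
    inter-aural cross-correlation peak (positive lag = right ear
    leads, i.e. a source to the listener's right)."""
    cc = np.correlate(left, right, mode="full")
    lag = np.argmax(cc) - (len(right) - 1)    # lag in samples
    return itd_to_azimuth(lag / fs)
```

A running version of such an estimate, computed frame by frame, yields a direction trajectory over time; spread across frequency bands gives a crude width estimate.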
Matlab functions
The IoSR Matlab Toolbox contains a number of functions and classes for auditory modelling, signal processing, sound source separation, statistics, plotting, and more.
Other Digital Resources
Binaural Room Impulse Responses Captured in Real Rooms
The binaural room impulse responses (BRIRs) used in A localisation- and precedence-based binaural separation algorithm are packaged with it, but are also available separately at a higher sampling frequency. The responses were captured in real rooms at the University, with sound sources placed on the frontal azimuthal plane (±90°) in 5° increments. The package includes documentation on how and where the responses were captured. The BRIRs are provided as stereo wave files and as spatially-oriented format for acoustics (SOFA) files. They were captured at 48 kHz, 16-bit, but are also included downsampled to 16 kHz.
Simulated Room Impulse Responses
This archive contains three groups of 11 sets of RIRs obtained from a room simulated in CATT-Acoustic modelling software. Each set has a different reverberation time, varied by changing the absorption coefficient of all six surfaces to produce reverberation times in the interval [0, 1] s. The room was shoebox-shaped with dimensions 6×4×3 m (l×w×h). The impulse responses were calculated with the receiver located in the centre of the room at a height of 2 m and the source at a distance of 1.5 m. The omnidirectional sound source was placed at head height on the frontal azimuthal plane (±90°) in 5° increments.
The sets are provided in three groups, with each group corresponding to a different receiver configuration. The three receiver configurations were: binaural, spaced omnidirectional, and mono omnidirectional. In the spaced omnidirectional case, the receivers were spaced by 16 cm (the approximate spacing of the ears in the binaural case). The mono omnidirectional case is provided for completeness.
The RIRs are provided as mono or stereo wave files, simulated at 44.1 kHz, 16-bit, and are also included down-sampled to 16 kHz.
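The relationship between uniform surface absorption and reverberation time that underlies this archive can be illustrated with the classical Sabine approximation; note that the simulations themselves used CATT's geometrical-acoustics model, not this formula, so this is only an indicative sketch for the room dimensions stated above:

```python
def sabine_rt60(volume, surface_area, absorption):
    """Classical Sabine estimate of reverberation time (seconds):
    RT60 = 0.161 * V / (S * alpha), with V in m^3 and S in m^2."""
    return 0.161 * volume / (surface_area * absorption)

def absorption_for_rt60(volume, surface_area, rt60):
    """Invert the Sabine formula: the uniform absorption coefficient
    needed to reach a target reverberation time."""
    return 0.161 * volume / (surface_area * rt60)

# the 6 x 4 x 3 m shoebox room described above
V = 6 * 4 * 3                    # volume: 72 m^3
S = 2 * (6 * 4 + 6 * 3 + 4 * 3)  # total surface area: 108 m^2
```

For example, a uniform absorption coefficient of 0.2 gives an estimated RT60 of roughly 0.54 s for this room, and sweeping the coefficient upwards walks the reverberation time down through the [0, 1] s interval.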
Our research aims to provide tools to assist in any area where assessment of the quality of audio as perceived by human listeners (either overall or in terms of specific timbral or spatial attributes) is desirable but, for one reason or another, problematic, and to provide complementary tools to facilitate appropriate adjustment where the assessed quality is not as it should be. More succinctly, and more generally, we aim to engineer perceptually-motivated signal analysis, processing and control systems. If we have a single over-arching goal, it is simply this: to make sound better.
View our members-only website.