Acoustic Source Separation in Real Time

A real-time solution to the ‘cocktail party’ problem and sound localisation.

Summary

This technology separates and localises simultaneously active sound sources in real time. Individual sources can be isolated for reproduction or suppression in the presence of noise and other interfering sounds. It can be used in practically any application that needs audio recordings free from interference.

Main Features

  • Small, lightweight microphone array.
  • Real-time separation with less than 1/40 s (25 ms) delay.
  • SNR improvement of 14.4 dB on average, and as high as 26 dB in some cases.
  • Small number of audio channels; advantageous in interfacing, storage and processing.
  • Automatic localisation and separation of multiple sources in horizontal and vertical axes.
  • Tracking and isolating a moving sound source.
  • 3D symmetry in operation; no preferred direction of operation.
  • Broadband operation, i.e., across the full audio range.
  • High quality binaural spatial reproduction of the separated sounds.
  • Interactive real-time parameter adjustment for fine-tuning sound quality.
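
To put the decibel figures above in linear terms, the short sketch below converts a dB improvement into the equivalent power and amplitude ratios using the standard definitions. The function names are illustrative only and are not part of the technology described here.

```python
def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to a power ratio (10 * log10 convention)."""
    return 10.0 ** (db / 10.0)


def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to an amplitude ratio (20 * log10 convention)."""
    return 10.0 ** (db / 20.0)


# Figures quoted in the feature list: average and best-case SNR improvement.
for improvement_db in (14.4, 26.0):
    power = db_to_power_ratio(improvement_db)
    amplitude = db_to_amplitude_ratio(improvement_db)
    print(f"{improvement_db} dB -> power x{power:.1f}, amplitude x{amplitude:.2f}")
```

Under these definitions, 14.4 dB corresponds to the interference power falling by a factor of roughly 27.5 relative to the wanted signal, and 26 dB to a factor of roughly 400.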

Potential Applications

  • Surveillance and security: CCTV systems that automatically point towards sounds and listen, and carry out automatic keyword/threat detection in noisy environments such as airports.
  • Mobile phones: Environmental noise and interference suppression on mobile devices.
  • Automotive: Improved voice capture and recognition in cars.
  • Immersive, remote collaboration, hands-free teleconferencing: Selective transmission of multiple speech sounds and their processing for 3D reproduction, including acoustic echo cancellation.
  • Content production, broadcasting: Flexibility in audio object editing for spatial synchronicity of reproduced audio and video.
  • Robotics: Robots that turn towards sound sources and respond to commands coming from a specific direction.
  • Human-computer interfaces: Pre-processing to improve signal-to-noise ratio for speech recognition.
  • Biometric identification: Pre-processing to improve signal-to-noise ratio for speaker identification.
  • Hearing aids, enhanced hearing: Listening to selected sounds or conversations at a cocktail party.

For more information, please contact:

Martyn Buxton-Hoare,
Assistant Director - Technology Transfer
T +44(0)1483 683 670
M +44(0)7714 604 823
m.buxton-hoare@surrey.ac.uk

 

Development status: demonstration available

Availability: available for licensing.

IP status: Patent filed by University of Surrey

Technical: Block-based processing enables fast, near real-time implementation even at high sampling rates such as 44.1 kHz. Recordings can be made with a compact microphone array as small as a ping-pong ball, or smaller. Performance tests in semi-reverberant and reverberant environments with two or three sound sources show good signal-to-interference ratios (SIR) (Fig 2).
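
As a rough illustration of how block-based processing relates to the delay figure quoted in the feature list, the sketch below computes the buffering delay of one block at 44.1 kHz and finds the largest power-of-two block that fits within 1/40 s. The actual block length used by the system is not stated here; the power-of-two choice and the function names are assumptions made for illustration, and the calculation covers only the time to fill one block, not any additional processing time.

```python
def block_latency_ms(block_size: int, sample_rate_hz: float) -> float:
    """Time needed to fill one block of samples, in milliseconds."""
    return 1000.0 * block_size / sample_rate_hz


def largest_pow2_block(max_delay_s: float, sample_rate_hz: float) -> int:
    """Largest power-of-two block length whose duration fits within max_delay_s."""
    max_samples = int(max_delay_s * sample_rate_hz)  # whole samples that fit
    if max_samples < 1:
        raise ValueError("delay budget is shorter than one sample")
    return 1 << (max_samples.bit_length() - 1)


SAMPLE_RATE_HZ = 44_100  # sampling rate mentioned in the text
MAX_DELAY_S = 1 / 40     # delay bound from the feature list

# Hypothetical block size: power of two is an assumption, not a stated fact.
block = largest_pow2_block(MAX_DELAY_S, SAMPLE_RATE_HZ)
latency = block_latency_ms(block, SAMPLE_RATE_HZ)
print(f"{block} samples per block -> {latency:.1f} ms buffering delay")
```

With these assumptions, a 1024-sample block takes about 23.2 ms to fill at 44.1 kHz, which sits just inside the 25 ms bound; a 2048-sample block would exceed it.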