Creative and entertainment
One of the ways AI improves human wellbeing is through life-enhancing entertainment technologies in the fields of audio (including speech and music), video (film, TV and immersive experiences), and other creative applications.
AI for creative and entertainment
The last 20 years have seen the creative industries move from manual to automated digital processes, kick-starting a boom in virtual reality (VR), augmented reality (AR) and 3D spatial audio. Surrey’s Centre for Vision, Speech and Signal Processing (CVSSP) is at the forefront of research in all three fields.
The Centre has pioneered visual effects which are used to bring stories to life not only in sci-fi blockbusters but also in period dramas, live-action sports broadcasts and many other types of entertainment content. It first developed the concept of 3D video capture in the mid-1990s, which opened the door to highly realistic animations based on the real movements of people and animals using AI and machine learning, and it has continued to push the boundaries of these techniques.
More recently, CVSSP has developed the concept of ‘4D vision’, which uses a combination of multi-camera capture systems and advanced algorithms to model complex scenes in real time. This is enabling autonomous systems not only for entertainment but also for healthcare, assisted living, animal welfare and security applications.
A step-change in home audio
S3A spatial audio will, for the first time, give consumers the sense of ‘being there’ at a live event such as a concert or football match, from the comfort of their living room and without the need for specialist equipment or a complex speaker set-up.
In the field of audio, CVSSP is investigating ‘machine listening’ algorithms that manipulate signals for speech and audio applications, with the ultimate aim of optimising the way audio content is delivered. This includes creating computer models of auditory perception which can measure, control and optimise audio to adapt automatically to the listener; enhancing the audio description of images and TV programmes for visually impaired people; and separating audio sources (such as ‘cocktail party’ type speech).
As part of the EPSRC S3A Spatial Audio programme, a major collaboration with the BBC and the Universities of Salford and Southampton, CVSSP is working on enabling a fully immersive at-home listener experience based on spatial audio techniques.