
Peng Zhang
About
My research project
Deep learning for audio-visual scene analysis

Future sound systems are expected to recommend “what” media content users may choose to consume, yet typical audio reproduction systems cannot readily adapt “how” they present the experience. This project aims to design advanced audio-visual signal processing algorithms that analyse the acoustic environment and user context, including sound events within the room (e.g. people talking while music plays in the background). This information will be used to inform the spatial audio system for sound reproduction, making it aware of the user's intentions, choices and environment. The engineering challenge is to give users the controls they would like in order to adjust the behaviour of their audio system: for example, boosting the bass response and its interaction with the room to provide greater envelopment, or exploiting the spatial extent of an immersive sound reproduction system to render the content more intelligibly.
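The scene-analysis component described above amounts to sound event classification: labelling what is happening acoustically in the room so the reproduction system can adapt. Below is a minimal sketch of such a classifier, assuming a small PyTorch CNN over log-mel spectrograms; the project does not specify an architecture, so the label set, sample rate and every hyperparameter here are illustrative placeholders, not the project's actual design.

    # Minimal sound-event classifier sketch (illustrative; architecture,
    # labels and hyperparameters are assumptions, not the project's design).
    import torch
    import torch.nn as nn
    import torchaudio

    CLASSES = ["speech", "music", "speech_and_music", "silence"]  # assumed label set

    class SoundEventClassifier(nn.Module):
        def __init__(self, n_mels: int = 64, n_classes: int = len(CLASSES)):
            super().__init__()
            # Waveform -> log-mel spectrogram front end (16 kHz mono assumed).
            self.melspec = torchaudio.transforms.MelSpectrogram(
                sample_rate=16000, n_fft=1024, hop_length=512, n_mels=n_mels
            )
            self.to_db = torchaudio.transforms.AmplitudeToDB()
            # Small CNN backbone over the (frequency, time) plane.
            self.backbone = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),  # global pooling -> fixed-size embedding
            )
            self.head = nn.Linear(32, n_classes)

        def forward(self, waveform: torch.Tensor) -> torch.Tensor:
            # waveform: (batch, samples) mono audio at 16 kHz.
            x = self.to_db(self.melspec(waveform)).unsqueeze(1)  # (B, 1, mel, time)
            x = self.backbone(x).flatten(1)                      # (B, 32)
            return self.head(x)                                  # class logits

    # Usage on one second of (here, random) audio:
    model = SoundEventClassifier()
    logits = model(torch.randn(1, 16000))
    print(CLASSES[logits.argmax(dim=-1).item()])

In a deployed system the predicted label would drive the rendering policy, for example switching the reproduction system into a dialogue-intelligibility mode when speech is detected over background music.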
Supervisors