
Yi Yuan
Academic and research departments
Centre for Vision, Speech and Signal Processing (CVSSP), Department of Electrical and Electronic Engineering.
About
My research project
Deep learning based natural sound generation
Deep learning for automated audio generation
Designing immersive sound is of great interest. In such applications, the aim is to give users an immersive experience by bringing feeling, sensation, vision, and audition into the digital world. Because acoustic environments are so diverse, from the variety of individual physical events to the complexity of whole acoustic scenes, it is challenging to design the specific acoustic scenes or events that a given application requires. There is therefore an increasing demand for tools that enable automatic generation of sounds from tags or descriptions given by users. We aim to develop a new system for natural audio generation for the creation of video games and other entertainment. With these models, input text (or images/video) describing a scene will be translated into the corresponding audio clips, such as dog barks, gunshots, blowing wind, and other kinds of natural or individual sounds.
Supervisors