10:30am - 11:15am
Tuesday 29 January 2019
Deep learning based speech signal processing
- Professor Hong-Goo Kang
Speech is still one of the most convenient ways of designing natural human-computer-interaction systems, partly because of the convenience of sensors such as microphones. Deep learning methods, which have brought about paradigm shifts in many research fields, now also play a key role in natural speech interface areas such as speech enhancement, automatic speech/speaker recognition, and text-to-speech.
In this talk, we introduce recent research activities of the DSP & AI Lab at Yonsei University, briefly describing three projects:
- Background noise removal for audio/video clips
- Emotional text-to-speech systems
- Audio-visual signal processing applications
All of the projects described in this talk require in-depth knowledge of deep learning techniques as well as speech signal processing theory.
Hong-Goo Kang received the B.S., M.S., and Ph.D. degrees from Yonsei University, Korea, in 1989, 1991, and 1995, respectively, and is currently a professor at Yonsei University. From 1996 to 2002, he was a senior technical staff member at AT&T Labs-Research, Florham Park, New Jersey.
He has actively participated in international collaborative efforts, hosted by ITU-T and MPEG, to develop new speech/audio coding standard algorithms. From 2005 to 2008, he was an associate editor of the IEEE Transactions on Audio, Speech, and Language Processing. He has also served on numerous conference and program committees, and was the vice-chair of the technical program committee for INTERSPEECH 2004, held on Jeju Island, Korea.
He was a visiting scholar at Broadcom (Irvine, CA) from 2008 to 2009 and at Google (Mountain View, CA) from 2015 to 2016, where he participated in various projects on speech signal processing. His research interests include speech/audio signal processing, machine learning, and human-computer interfaces.