2pm - 3pm
Tuesday 26 September 2023
Fine-grained video understanding and generation
CVSSP external seminar with Dr Yansong Tang.
University of Surrey
Teams Meeting ID: 381 426 627 230
Teams Passcode: 74L42K
The talk will address the problems of fine-grained video understanding and generation, which pose more challenges than conventional scenarios and have wide application value in sport, cooking, entertainment and other areas.
The talk will begin with our work on instructional video analysis, including the COIN dataset and a new condensed action space learning method for procedure planning in instructional videos. Second, I will introduce an uncertainty-aware score distribution learning method and a group-aware attention method for action quality assessment.
Finally, I will introduce how we leverage multimodal information (e.g., language and music) to enhance the performance of referring segmentation and dance generation.
Yansong Tang is a tenure-track Assistant Professor at Shenzhen International Graduate School, Tsinghua University, China. Before that, he was a postdoctoral researcher in the Department of Engineering Science, University of Oxford.
He received his BS and PhD degrees with honours from Tsinghua University. He has also spent time as a research assistant in the Visual Computing Group of Microsoft Research Asia (MSRA) and the VCLA lab at the University of California, Los Angeles (UCLA). His research interests lie in computer vision.
Currently, he is working in the fields of video analytics and vision-language understanding, and he has published more than 30 scientific papers in top-tier journals and conferences including TPAMI, CVPR and ICCV. He served as an area chair for the IEEE International Conference on Automatic Face and Gesture Recognition in 2022.