2pm - 3pm

Tuesday 26 September 2023

Fine-grained video understanding and generation

CVSSP external seminar with Dr Yansong Tang.


21BA02 or via Teams
University of Surrey


Teams Meeting ID: 381 426 627 230 
Teams Passcode: 74L42K 


The talk will address fine-grained video understanding and generation, which pose new challenges compared with conventional scenarios and have wide application value in sport, cooking, entertainment and other areas.

The talk will start with our work on instructional video analysis, including the COIN dataset and a new condensed action space learning method for procedure planning in instructional videos. Second, I will introduce an uncertainty-aware score distribution learning method and a group-aware attention method for action quality assessment.

Finally, I will introduce how we leverage multimodal information (e.g., language and music) to enhance the performance of referring segmentation and dance generation.

Short biography

Yansong Tang is a tenure-track Assistant Professor at Shenzhen International Graduate School, Tsinghua University, China. Before that, he was a postdoctoral researcher in the Department of Engineering Science, University of Oxford.

He received his BS and PhD degrees with honours from Tsinghua University. He has also spent time as a research assistant in the Visual Computing Group of Microsoft Research Asia (MSRA) and the VCLA lab at the University of California, Los Angeles (UCLA). His research interests lie in computer vision.

Currently, he is working in the fields of video analytics and vision-language understanding, and he has published more than 30 scientific papers in top-tier journals and conferences including TPAMI, CVPR and ICCV. He served as an area chair of the IEEE International Conference on Automatic Face and Gesture Recognition in 2022.
