Researchers in Surrey’s CVSSP have been ranked in the top ten out of 650 entries in Google’s ‘YouTube-8M Video Understanding Challenge’.
The competition, organised by Google on its Kaggle platform, aimed to accelerate research in large-scale video understanding to improve the search and organisation of video archives, and was open to industry professionals, researchers and students from across the world.
Professor Miroslaw Bober and his ‘Yeti’ team (Mikel Bober-Irizar, Dr Eng-Jon Ong and Dr Sameed Husain) from CVSSP (Centre for Vision, Speech and Signal Processing) beat competition from major international corporations and global university groups to win a Gold Medal. CVSSP’s entry will now be presented at CVPR 2017, the world’s leading annual computer vision event, taking place in Honolulu, Hawaii from 21 to 26 July.
The challenge set by Google was to develop algorithms which accurately and automatically assign labels to videos using a dataset created from over seven million YouTube videos. Surrey’s researchers designed an AI (Artificial Intelligence) system which is able to learn to understand the story behind any video – just like humans do – and give a short verbal summary of what it is about. In order to do this, the system ‘watched’ almost half a million videos, with a total duration of 50 years.
Professor Bober explained: “There are hundreds of billions of images and videos out there, so it’s not possible to annotate them all. Thanks to this type of research, we can rapidly analyse large volumes of multimedia content such as video, images or sound tracks, which makes it possible to quickly find content of interest, without the need for laborious manual annotation.”
He added: “We are witnessing breath-taking advances in AI: it is already impacting our lives in areas from artificial vision to autonomous vehicles and genomic medicine. This Kaggle competition in ‘video understanding’ was an amazing adventure with AI and we are thrilled to have developed one of the ten gold medal-winning solutions.”
Professor Adrian Hilton, Head of CVSSP, said: “Achieving a top ranking in this highly prestigious competition has required a huge amount of effort and innovation from Professor Bober and his team over the past three months. This success, ahead of teams from major international corporations, reflects the world-leading contribution of the Centre’s research in visual recognition.”
The deep learning technologies CVSSP has developed over the past few years are being used by the BBC and by the police, where they have helped to capture criminals by analysing CCTV data. The same technologies can be applied to other domains where there is a need to analyse and understand large volumes of complex data – for example in healthcare, to detect factors contributing to diseases and find the best treatments.