Video and audio retrieval

CVSSP has established an internationally leading reputation in visual search and in the recognition of objects and activities within images and video. CVSSP regularly enters, and ranks among the top contenders in, international content-based image and video retrieval (CBIR/CBVR) benchmarking exercises. Recent successes include winning the PASCAL Visual Object Classes (VOC) challenge in 2010.

Video and audio retrieval projects

iTrace: Large-scale visual search for plagiarism detection in the Arts

Web-based services such as TurnItIn enable educational institutions to upload text documents, which are compared against web sources and previously uploaded coursework to detect plagiarism. iTrace seeks to provide a similar web-based plagiarism-checking service for the Visual Arts, using CVSSP visual search technology. iTrace is led by Dr John Collomosse and funded by JISC.

I-Dash: Investigator's dashboard

I-Dash is a project within the Safer Internet Plus programme of the European Commission. It focuses on the development of automatic tools to support police professionals in investigations involving large quantities of child abuse video material. I-Dash project partners include Interpol, UK CEOP, and several European national police forces. The project was led by Dr Krystian Mikolajczyk.

Digital dance archives

The digital dance archives (DDA) serve as an online portal to digitised collections of dance performance and related imagery (e.g. rehearsal contact sheets). DDA is a collaboration between the University of Surrey and Coventry University, and resulted in a large cross-section of the UK National Resource Centre for Dance (NRCD) physical archive being digitised and placed online. The Siobhan Davies RePlay dance archive was also indexed. Visual search technology developed at CVSSP enables visual search of these dance collections, e.g. to find similar choreography and poses across large and diverse image collections, in many cases digitised from ageing media at low resolution and under challenging lighting conditions. The DDA project is led by Dr John Collomosse and funded by the AHRC.

CLARET II: Image classification and retrieval system

CLARET II is a collaborative research project between the University of Surrey and BBC Research. The system is intended to assist with the annotation of user-generated content (UGC): images sent to the BBC by the general public, which are then annotated and displayed on BBC websites. CLARET is led by Dr Krystian Mikolajczyk and funded by the EPSRC.

Sketch-based visual information retrieval

The Sketch-based visual information retrieval project explores the relationship between images and drawings, developing scalable techniques to search large photo and video collections using sketches.

The project is led by Dr John Collomosse and is funded by the EPSRC.
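
For illustration only, the snippet below sketches one common approach to this kind of search (it is not the CVSSP system itself): the query sketch and each photograph are reduced to edge maps, summarised with HOG descriptors, and ranked by cosine similarity. The file names, Canny thresholds and descriptor parameters are assumptions chosen for the example.

# Minimal illustrative sketch-to-photo retrieval pipeline (assumed approach, not the project's method).
import cv2
import numpy as np

HOG = cv2.HOGDescriptor()   # default 64x128 window -> 3780-dimensional descriptor
WIN = (64, 128)             # (width, height) expected by the default descriptor

def describe(path, is_sketch=False):
    """Load an image, reduce it to an edge map and return a HOG descriptor."""
    grey = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Sketches are already line art; photographs are converted to edges first.
    edges = grey if is_sketch else cv2.Canny(grey, 100, 200)
    edges = cv2.resize(edges, WIN)
    return HOG.compute(edges).flatten()

def rank(sketch_path, photo_paths):
    """Return photo paths ordered by cosine similarity to the query sketch."""
    q = describe(sketch_path, is_sketch=True)
    sims = []
    for p in photo_paths:
        d = describe(p)
        sims.append(float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d) + 1e-8)))
    return [p for _, p in sorted(zip(sims, photo_paths), reverse=True)]

# Hypothetical usage:
# print(rank("query_sketch.png", ["photo1.jpg", "photo2.jpg"]))

A real system would replace the linear scan with an approximate nearest-neighbour index to scale to large photo and video collections.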

ACASVA: Adaptive cognition for automated sports video annotation

The project "Adaptive Cognition for Automated Sports Video Annotation" (ACASVA) addresses the challenging problem of autonomous cognition at the interface of vision and language, focusing in particular on transferring learning between different audio-visual domains. It is a joint research venture bringing together interdisciplinary scientific and engineering expertise at the Centre for Vision, Speech and Signal Processing (CVSSP) at the University of Surrey, the School of Biological and Chemical Sciences at Queen Mary, University of London, and the School of Computing Sciences (CMP) at the University of East Anglia. The project is led by Dr David Windridge and Prof Josef Kittler, and funded by the EPSRC.

VIDI-Video

The goal of VIDI-Video was to improve the accessibility of large video archives. VIDI-Video was an FP6 EU framework project, delivering innovation across all stages of the video retrieval pipeline, from indexing and retrieval to browsing, presentation and relevance feedback. Partners included the University of Amsterdam and INESC-ID. VIDI-Video was led by Prof Josef Kittler and Dr Krystian Mikolajczyk.

MUSCLE: Multimedia understanding through semantics, computation and learning

MUSCLE is an EC-sponsored Network of Excellence that aims to establish and foster closer collaboration between research groups in multimedia data mining and machine learning. The Network integrates the expertise of over forty research groups working on image and video processing, speech and text analysis, statistics and machine learning. The goal is to explore the full potential of statistical learning and cross-modal interaction for the (semi-)automatic generation of robust metadata with high semantic value for multimedia documents. Surrey is a member of the EU MUSCLE Network.
