Computer Vision for Automatic Storyboard Generation in Film and TV Production

The goal of the PhD is to explore the relationship between text, image, and video media through machine learning, enabling a computer to create storyboards and generate new, unseen pictures simply from text written by a human. You will be supervised by experts in machine learning, Dr Andrew Gilbert and Dr Armin Mustafa in CVSSP, in conjunction with the industrial partner BBC R&D, which will provide access to its vast dataset of relevant BBC archive programming.

Start date
1 October 2021
Standard project duration is 4 years.
Application deadline
Funding source
The University of Surrey, Project-led Studentship Award.
Funding information

The funding package for this studentship award is as follows: 

  • Full tuition fee covered (UK, EU and International)
  • Stipend at £15,609 p.a. (2021/22)
  • Research Training Support Grant of £1,000 p.a.
  • A personal computer


This project will explore using text to generate new images and scenes, converting sentences from a TV or film director into realistic image-based film storyboards and film backgrounds for virtual production using artificial intelligence (AI). The research will aid the creative process in film and TV production, saving production teams time and money. It also has the longer-term benefit of improving the accessibility of film and TV by automatically generating text descriptions of scenes for partially sighted viewers. This project will be the ultimate test of the spatial, visual, and semantic world knowledge used for automatic storyboard generation.

You will be supervised by experts in machine learning, Dr Andrew Gilbert and Dr Armin Mustafa in CVSSP, in conjunction with the industrial partner BBC R&D, which will provide access to its vast dataset of relevant BBC archive programming. You’ll also be able to make extensive use of the machine learning facilities at the University of Surrey, including the sizeable AI@Surrey GPU cluster and other GPU servers.

Prior knowledge of machine learning and computer vision is essential to apply for this fully funded position. Any interest in generative or graph networks should be highlighted in your application. We are looking for a student with strong mathematical and programming skills who is willing to learn, hardworking, and keen to work in a team to solve these challenges.

Related links
Department of Music and Media - Research courses
Centre for Vision, Speech and Signal Processing (CVSSP)
Digital Media Arts PhD

Eligibility criteria

Applicants are expected to hold a first-class or 2:1 honours degree (or equivalent overseas qualification) in an appropriate discipline (e.g. engineering, computer science, signal processing, applied mathematics, or physics).

Candidates should be able to demonstrate excellent mathematical, analytical, and programming skills.

Previous experience in computer vision and machine/deep learning would be advantageous.

UK, EU and international students are welcome to apply.

Non-native speakers of English will be required to have IELTS 6.5 or above (or equivalent), with no sub-test score below 6.

How to apply

For informal enquiries, please contact Dr Andrew Gilbert in the first instance.

Applications should be made through our Digital Media Arts PhD programme page and should specify the point of contact as Dr Andrew Gilbert.  

You must also attach a CV, certified copies of degree certificates and transcripts, a personal statement describing relevant experience (maximum two pages), two references, and proof of eligibility (e.g. passport or residence permit). Shortlisted applicants will be contacted directly to arrange a suitable time for an interview. 

Contact details

Andrew Gilbert
04 BC 03
Telephone: +44 (0)1483 684713

Jointly between C-CATS (Centre for Creative Arts and Technology) and CVSSP (Centre for Vision, Speech and Signal Processing).

