MeMAD: Methods for managing audiovisual data - Combining automatic efficiency with human accuracy
MeMAD is a Horizon 2020 research project led by Aalto University, Finland. The project will develop novel methods and models for managing and accessing digital audiovisual information in multiple languages, across varied use contexts, and for diverse audiences. It will combine computer vision technologies, human input, and machine learning approaches to derive enhanced descriptions of audiovisual content. These descriptions will benefit creative industries, such as TV broadcasters and on-demand media service providers, as well as the people using their services, by enabling them to access audiovisual information in novel ways.
As a partner in this project, the Centre for Translation Studies is responsible for Work Package 5, Comparing Human and Machine-Generated Multimodal Content Description and Translation. Building on our expertise in investigating the semantic, pragmatic, and discursive foundations of human audio description as an instance of multimodal translation, and on our experience in developing multimodal corpora, we will analyse and compare human- and machine-generated descriptions of audiovisual content.
The main objectives are to identify characteristic features and patterns of each method and to re-model audio description, which was originally developed as an aid for visually impaired people, into a method of describing audiovisual content for diverse audiences. The ultimate aim is to contribute to a conceptual solution for machine-assisted video description.
The outcomes of this work package will facilitate storytelling and the re-use of content in the broadcasting context, and improve media access for visually impaired people and other diverse groups.
Project partners:

- Aalto University (FI)
- University of Helsinki (FI)
- Eurecom (FR)
- University of Surrey (UK)
- YLE (FI)
- Lingsoft (FI)
- Limecraft (BE)
- Institut National de l'Audiovisuel (FR)