Sobhan Asasi

Postgraduate Research Student

PhD

s.asasi@surrey.ac.uk

Personal Website

Academic and research departments

Centre for Vision, Speech and Signal Processing (CVSSP)

About

My research project

Large lexicon sign language recognition and translation

I am currently working on leveraging the capabilities of video LLMs to improve sign language understanding.

Supervisors

Richard Bowden

Mohamed Lakhal

Özge Mercanoğlu Sincan

Publications

Sobhan Asasi, Mohamed Ilyes Lakhal, Richard Bowden (2025) Hierarchical Feature Alignment for Gloss-Free Sign Language Translation

Sign Language Translation (SLT) attempts to convert sign language videos into spoken-language sentences. However, many existing methods struggle with the disparity between visual and textual representations during end-to-end learning. Gloss-based approaches help to bridge this gap by leveraging structured linguistic information. While gloss-free methods offer greater flexibility and remove the burden of annotation, they require effective alignment strategies. Recent advances in Large Language Models (LLMs) have enabled gloss-free SLT by generating text-like representations from sign videos. In this work, we introduce a novel hierarchical pre-training strategy inspired by the structure of sign language, incorporating pseudo-glosses and contrastive video-language alignment. Our method hierarchically extracts features at frame, segment, and video levels, aligning them with pseudo-glosses and the spoken sentence to enhance translation quality. Experiments demonstrate that our approach improves BLEU-4 and ROUGE scores while maintaining efficiency.
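The two ideas named in the abstract, features at frame, segment, and video levels and contrastive video-language alignment, can be illustrated with a minimal sketch. This is not the authors' implementation: the mean pooling, the fixed segment length, the symmetric InfoNCE loss, the temperature value, and every function name below are assumptions chosen only to make the idea concrete.

```python
import math


def mean_pool(vectors):
    """Average a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def hierarchical_features(frames, segment_len):
    """Build frame-, segment-, and video-level features.

    Assumption: segments are fixed-length windows of frames and each
    coarser level is the mean of the level below it.
    """
    segments = [
        mean_pool(frames[i:i + segment_len])
        for i in range(0, len(frames), segment_len)
    ]
    video = mean_pool(segments)
    return frames, segments, video


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(video_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    Pair i in video_embs is assumed to match pair i in text_embs;
    all other combinations in the batch act as negatives.
    """
    v = [l2_normalize(x) for x in video_embs]
    t = [l2_normalize(x) for x in text_embs]
    # Cosine similarities scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(vi, tj)) / temperature for tj in t]
        for vi in v
    ]

    def cross_entropy_diag(rows):
        # Cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    columns = [list(c) for c in zip(*logits)]
    # Average the video-to-text and text-to-video directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(columns))
```

In a setup like the one the abstract describes, a loss of this kind could be applied at more than one level, for example segment features against pseudo-gloss embeddings and the video feature against the sentence embedding. How the levels are actually combined is not stated here, so the sketch leaves that open.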