Muhammad Awais Centre for Vision Speech and Signal Processing CVSSP, Surrey Institute for people-centred AI; self-supervised learning; deep learning; machine learning; foundation models; multimodal learning and analysis

Dr Muhammad Awais

Senior Lecturer in Trustworthy and Responsible AI. Leading research on foundation models and self-supervised learning.
PhD in AI, MSc in AI, BSc in Computer Engineering, BSc in Math and Physics



Research interests


SYED SAFWAN KHALID, MUHAMMAD AWAIS TANVIR RANA, ZHENHUA FENG, CHI HO CHAN, AMMARAH FAROOQ, ALI AKBARI, JOSEF VACLAV KITTLER (2022) NPT-Loss: Demystifying face recognition losses with Nearest Proxies Triplet, In: IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE

Face recognition (FR) using deep convolutional neural networks (DCNNs) has seen remarkable success in recent years. One key ingredient of DCNN-based FR is the design of a loss function that ensures discrimination between various identities. The state-of-the-art (SOTA) solutions utilise normalised Softmax loss with additive and/or multiplicative margins. Despite being popular and effective, these losses are justified only intuitively, with little theoretical explanation. In this work, we show that under the LogSumExp (LSE) approximation, the SOTA Softmax losses become equivalent to a proxy-triplet loss that focuses on nearest-neighbour negative proxies only. This motivates us to propose a variant of the proxy-triplet loss, entitled Nearest Proxies Triplet (NPT) loss, which, unlike SOTA solutions, converges for a wider range of hyper-parameters and offers flexibility in proxy selection, and thus outperforms SOTA techniques. We generalise many SOTA losses into a single framework and give theoretical justifications for the assertion that minimising the proposed loss ensures a minimum separability between all identities. We also show that the proposed loss has an implicit mechanism of hard-sample mining. We conduct extensive experiments using various DCNN architectures on a number of FR benchmarks to demonstrate the efficacy of the proposed scheme over SOTA methods.
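
The LogSumExp relationship mentioned in the abstract can be checked numerically. The sketch below is only an illustration of that general approximation, not the NPT loss as defined in the paper: it omits margins, and the function names, the scale of 64 and the example similarities are all assumptions made for the example. It compares a scaled Softmax loss on cosine similarities with a hinge on the gap to the single highest-scoring negative proxy.

```python
import math

def scaled_softmax_loss(similarities, target, scale):
    """Softmax cross-entropy on scaled similarities, divided by the scale.

    Written as log(1 + sum_j exp(scale * (s_j - s_y))) / scale over negatives j.
    """
    s_y = similarities[target]
    total = sum(
        math.exp(scale * (s - s_y))
        for j, s in enumerate(similarities)
        if j != target
    )
    return math.log1p(total) / scale

def nearest_negative_hinge(similarities, target):
    """Triplet-style hinge that looks only at the nearest negative proxy."""
    s_y = similarities[target]
    nearest_negative = max(s for j, s in enumerate(similarities) if j != target)
    return max(0.0, nearest_negative - s_y)

# Hypothetical cosine similarities between one embedding and four class proxies.
sims = [0.2, 0.7, 0.5, -0.1]

# Misclassified sample (true class 0): both losses are close to the gap 0.7 - 0.2.
print(scaled_softmax_loss(sims, 0, 64.0), nearest_negative_hinge(sims, 0))

# Well-separated sample (true class 1): both losses are close to zero.
print(scaled_softmax_loss(sims, 1, 64.0), nearest_negative_hinge(sims, 1))
```

With a large scale, the sum of exponentials is dominated by its largest term, which is why only the nearest negative proxy matters in the limit.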

Lei Ju, Josef Vaclav Kittler, Muhammad Awais Tanvir Rana, Wankou Yang, Zhenhua Feng (2023) Keep an eye on faces: Robust face detection with heatmap-Assisted spatial attention and scale-Aware layer attention, In: Pattern Recognition 140, Elsevier Ltd

• We propose supervised spatial attention that employs a heatmap generator for instructive feature learning.
• We formulate a rectified Gaussian scoring function to generate informative heatmaps.
• We present scale-aware layer attention that eliminates redundant information from pyramid features.
• A voting strategy is designed to produce more reliable classification results.
• Our face detector achieves encouraging performance in accuracy and speed on several benchmarks.

Modern anchor-based face detectors learn discriminative features using large-capacity networks and extensive anchor settings. In spite of their promising results, they are not without problems. First, most anchors extract redundant features from the background. As a consequence, the performance improvements are achieved at the expense of a disproportionate computational complexity. Second, the predicted face boxes are only distinguished by a classifier supervised by pre-defined positive, negative and ignored anchors. This strategy may ignore potential contributions from cohorts of anchors labelled negative/ignored during inference simply because of their inferior initialisation, although they can regress well to a target. In other words, true positives and representative features may get filtered out by unreliable confidence scores. To deal with the first concern and achieve more efficient face detection, we propose a Heatmap-assisted Spatial Attention (HSA) module and a Scale-aware Layer Attention (SLA) module to extract informative features using lower computational costs. To be specific, SLA incorporates the information from all the feature pyramid layers, weighted adaptively to remove redundant layers. HSA predicts a reshaped Gaussian heatmap and employs it to facilitate a spatial feature selection by better highlighting facial areas. For more reliable decision-making, we merge the predicted heatmap scores and classification results by voting. Since our heatmap scores are based on the distance to the face centres, they are able to retain all the well-regressed anchors. The experiments obtained on several well-known benchmarks demonstrate the merits of the proposed method.
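
As a rough illustration of a heatmap whose scores depend on the distance to face centres, the sketch below builds a plain Gaussian target from ground-truth boxes. It is not the rectified Gaussian scoring function of the paper, whose exact form is not given in the abstract; the function name, the `sigma_scale` parameter and the rule tying the spread to the box size are assumptions made for the example.

```python
import math

def gaussian_heatmap(height, width, boxes, sigma_scale=0.25):
    """Return a height x width grid with one Gaussian bump per face box.

    boxes holds (x1, y1, x2, y2) tuples in pixel coordinates. Each bump peaks
    at the box centre, and its spread is sigma_scale times the box size
    (an assumed rule). Overlapping bumps are merged with an element-wise max.
    """
    heatmap = [[0.0] * width for _ in range(height)]
    for x1, y1, x2, y2 in boxes:
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        sigma_x = max(sigma_scale * (x2 - x1), 1e-6)
        sigma_y = max(sigma_scale * (y2 - y1), 1e-6)
        for y in range(height):
            for x in range(width):
                dx = (x - cx) / sigma_x
                dy = (y - cy) / sigma_y
                value = math.exp(-0.5 * (dx * dx + dy * dy))
                heatmap[y][x] = max(heatmap[y][x], value)
    return heatmap

# One hypothetical face box on a 9 x 9 feature map.
heatmap = gaussian_heatmap(9, 9, [(2, 2, 6, 6)])
for row in heatmap:
    print(" ".join(f"{v:.2f}" for v in row))
```

Scores fall off smoothly with distance from the centre, so a location near a face keeps a high score regardless of which anchor label it was assigned.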

ALI AKBARI, MUHAMMAD AWAIS TANVIR RANA, SOROUSH FATEMIFAR, SYED SAFWAN KHALID, JOSEF VACLAV KITTLER (2021) A Novel Ground Metric for Optimal Transport based Chronological Age Estimation, In: IEEE Transactions on Cybernetics, IEEE

Label distribution learning (LDL) is the state-of-the-art approach to deal with a number of real-world applications, such as chronological age estimation from a face image, where there is an inherent similarity among adjacent age labels. LDL takes into account the semantic similarity by assigning a label distribution to each instance. The well-known Kullback–Leibler (KL) divergence is the most widely used loss function for the LDL framework. However, the KL divergence does not fully and effectively capture the semantic similarity among age labels, thus leading to sub-optimal performance. In this paper, we propose a novel loss function based on optimal transport theory for LDL-based age estimation. A ground metric function plays an important role in the optimal transport formulation. It should be carefully determined based on the underlying geometric structure of the label space of the application in hand. The label space in the age estimation problem has a specific geometric structure, i.e. closer ages have a stronger inherent semantic relationship. Inspired by this, we devise a novel ground metric function, which enables the loss function to increase the influence of highly correlated ages, thus exploiting the semantic similarity among ages more effectively than the existing loss functions. We then use the proposed loss function, namely the γ-Wasserstein loss, for training a deep neural network (DNN). This leads to a notoriously computationally expensive and non-convex optimisation problem. Following the standard methodology, we formulate the optimisation function as a convex problem and then use an efficient iterative algorithm to update the parameters of the DNN. Extensive experiments in age estimation on different benchmark datasets validate the effectiveness of the proposed method, which consistently outperforms state-of-the-art approaches.
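
The contrast between KL divergence and an optimal transport loss on ordered labels can be seen in a small example. The sketch below uses the plain absolute-difference ground metric |i - j|, for which the one-dimensional Wasserstein distance reduces to the summed absolute difference of the cumulative distributions. This is an assumed stand-in, not the ground metric proposed in the paper, and the function names and the epsilon clipping in the KL term are illustrative choices.

```python
import math
from itertools import accumulate

def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two distributions over ordered labels.

    Assumes the ground metric |i - j|, where the optimal transport cost equals
    the sum of absolute differences between the two cumulative distributions.
    """
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q), with q clipped below by eps to avoid division by zero."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Toy label space of five consecutive ages; the true age is the second label.
target = [0.0, 1.0, 0.0, 0.0, 0.0]
near_miss = [0.0, 0.0, 1.0, 0.0, 0.0]  # prediction off by one label
far_miss = [0.0, 0.0, 0.0, 0.0, 1.0]   # prediction off by three labels

print(kl_divergence(target, near_miss), kl_divergence(target, far_miss))
print(wasserstein_1d(target, near_miss), wasserstein_1d(target, far_miss))
```

In this toy case the KL divergence gives the same penalty to both predictions, while the transport cost grows with the distance between the predicted and true ages.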

Recently, rapidly growing efforts have been devoted to the challenging task of facial age estimation. The improvements in performance achieved by new algorithms are measured on several benchmarking test databases with different characteristics to check for consistency. While this is a valuable methodology in itself, a significant issue in most age estimation studies is that the reported results lack an assessment of intrinsic system uncertainty. Hence, a more in-depth view is required to examine the robustness of age estimation systems in different scenarios. The purpose of this paper is to conduct an evaluative and comparative analysis of different age estimation systems to identify trends, as well as the points of their critical vulnerability. In particular, we investigate four age estimation systems, including the online Microsoft service, the two best state-of-the-art approaches advocated in the literature, as well as a novel age estimation algorithm. We analyse the effect of different internal and external factors, including gender, ethnicity, expression, makeup, illumination conditions, and the quality and resolution of the face images, on the performance of these age estimation systems. The goal of this sensitivity analysis is to provide the biometrics community with insight into and understanding of the critical subject-, camera- and environment-based factors that affect the overall performance of the age estimation systems under study.

Soroush Fatemifar, Muhammad Awais, Ali Akbari, Josef Kittler (2020) A Stacking Ensemble for Anomaly Based Client-Specific Face Spoofing Detection, In: 2020 IEEE International Conference on Image Processing (ICIP), pp. 1371-1375, IEEE

To counteract spoofing attacks, the majority of recent approaches to face spoofing attack detection formulate the problem as a binary classification task in which real data and attack accesses are both used to train spoofing detectors. Although the classical training framework has been demonstrated to deliver satisfactory results, its robustness to unseen attacks is debatable. Inspired by the recent success of anomaly detection models in face spoofing detection, we propose an ensemble of one-class classifiers fused by a Stacking ensemble method to reduce the generalisation error in the more realistic unseen attack scenario. To be consistent with this scenario, anomalous samples are considered neither for training the component anomaly classifiers nor for the design of the Stacking ensemble. To achieve better face anti-spoofing results, we adopt client-specific information to build both the constituent classifiers and the Stacking combiner. In addition, we propose a novel 2-stage Genetic Algorithm to further improve the generalisation performance of the Stacking ensemble. We evaluate the effectiveness of the proposed systems on publicly available face anti-spoofing databases including Replay-Attack, Replay-Mobile and Rose-Youtu. The experimental results following the unseen attack evaluation protocol confirm the merits of the proposed model.
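
The idea of stacking one-class classifiers without ever seeing attack samples can be sketched in a few lines. The example below is a toy illustration under strong assumptions, not the system described in the paper: the base scorers are simple z-score detectors over hypothetical feature subsets, the combiner is a one-sided normalisation fitted on held-out genuine scores, and there is no genetic algorithm or client-specific modelling beyond fitting on a single client's genuine data. All class and variable names are invented for the example.

```python
import statistics

class OneClassScorer:
    """Anomaly scorer fitted on genuine samples only (mean absolute z-score)."""

    def __init__(self, feature_ids):
        self.feature_ids = list(feature_ids)

    def fit(self, genuine_rows):
        columns = [[row[i] for row in genuine_rows] for i in self.feature_ids]
        self.means = [statistics.fmean(c) for c in columns]
        self.stds = [statistics.pstdev(c) or 1.0 for c in columns]
        return self

    def score(self, row):
        return statistics.fmean(
            abs(row[i] - m) / s
            for i, m, s in zip(self.feature_ids, self.means, self.stds)
        )

class StackedOneClass:
    """Combiner fitted on base scores of held-out genuine samples only."""

    def __init__(self, base_scorers):
        self.base_scorers = base_scorers

    def fit(self, genuine_train, genuine_heldout):
        for scorer in self.base_scorers:
            scorer.fit(genuine_train)
        held = [[s.score(r) for r in genuine_heldout] for s in self.base_scorers]
        self.score_means = [statistics.fmean(c) for c in held]
        self.score_stds = [statistics.pstdev(c) or 1.0 for c in held]
        return self

    def score(self, row):
        # Only scores above the genuine average count as evidence of an attack.
        excess = [
            max(0.0, (s.score(row) - m) / sd)
            for s, m, sd in zip(self.base_scorers, self.score_means, self.score_stds)
        ]
        return statistics.fmean(excess)

# Hypothetical two-feature descriptors for one client's genuine accesses.
genuine_train = [[1.0, 10.0], [1.2, 9.5], [0.8, 10.5], [1.1, 10.2]]
genuine_heldout = [[0.9, 9.8], [1.05, 10.1], [1.15, 10.4]]

model = StackedOneClass([OneClassScorer([0]), OneClassScorer([1])])
model.fit(genuine_train, genuine_heldout)
print(model.score([1.0, 10.0]), model.score([5.0, 20.0]))
```

Because neither stage is fitted on attack data, the sketch mirrors the unseen-attack setting: anything sufficiently unlike the client's genuine accesses receives a higher score.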