Matthew Vowels

Associate Lecturer in Electroacoustics, PhD Researcher in A.I. in the Centre for Computer Vision, Speech and Signal Processing (CVSSP)
M.S., MSc, BMus (Hons) Tonmeister


Research interests

My publications


Vowels Matthew, Camgöz Necati Cihan, Bowden Richard (2020) Gated Variational AutoEncoders: Incorporating Weak Supervision to Encourage Disentanglement
15th IEEE International Conference on Automatic Face and Gesture Recognition
Variational AutoEncoders (VAEs) provide a means to generate representational latent embeddings. Previous research has highlighted the benefits of achieving representations that are disentangled, particularly for downstream tasks. However, there is some debate about how to encourage disentanglement with VAEs, and evidence indicates that existing implementations do not achieve disentanglement consistently. How well a VAE's latent space has been disentangled is often evaluated against our subjective expectations of which attributes should be disentangled for a given problem. Therefore, by definition, we already have domain knowledge of what should be achieved and yet we use unsupervised approaches to achieve it. We propose a weakly supervised approach that incorporates any available domain knowledge into the training process to form a Gated-VAE. The process involves partitioning the representational embedding and gating backpropagation. All partitions are utilised on the forward pass but gradients are backpropagated through different partitions according to selected image/target pairings. The approach can be used to modify existing VAE models such as beta-VAE, InfoVAE and DIP-VAE-II. Experiments demonstrate that, using gated backpropagation, latent factors are represented in their intended partition. The approach is applied to images of faces for the purpose of disentangling head-pose from facial expression. Quantitative metrics show that using Gated-VAE improves average disentanglement, completeness and informativeness, as compared with un-gated implementations. Qualitative assessment of latent traversals demonstrates its disentanglement of head-pose from expression, even when only weak/noisy supervision is available.
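The gating described in the abstract above can be illustrated with a minimal sketch. It assumes a toy linear encoder and decoder with a squared-error loss rather than a full VAE, and the names `gated_encoder_grad` and `partition_mask`, the two-partition layout, and all numeric values are hypothetical and not taken from the paper. Every latent unit contributes to the forward pass, but the gradient reaching the encoder is multiplied by a binary mask, so only the partition selected for the current image/target pairing receives an update.

```python
def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gated_encoder_grad(enc_w, dec_w, x, target, partition_mask):
    """Return dLoss/d(enc_w) for a toy linear autoencoder with gated latents.

    Forward:  z = enc_w @ x, x_hat = dec_w @ z,
              loss = 0.5 * ||x_hat - target||^2.
    Backward: the gradient w.r.t. z is multiplied element-wise by
              partition_mask (1 = partition open, 0 = gated off).
    """
    z = matvec(enc_w, x)                 # all latent units are used here
    x_hat = matvec(dec_w, z)
    err = [p - t for p, t in zip(x_hat, target)]
    # dLoss/dz = dec_w^T @ err
    grad_z = [sum(dec_w[i][k] * err[i] for i in range(len(err)))
              for k in range(len(z))]
    # Gate: zero the gradient for partitions not tied to this pairing.
    gated = [g * m for g, m in zip(grad_z, partition_mask)]
    # dLoss/d(enc_w)[k][j] = gated[k] * x[j]
    return [[g * xj for xj in x] for g in gated]


# Hypothetical layout: latent units 0-1 form partition A, units 2-3 partition B.
enc_w = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.2], [0.3, 0.1, -0.2], [0.2, 0.2, 0.1]]
dec_w = [[0.5, -0.2, 0.1, 0.3], [0.1, 0.4, -0.3, 0.2], [-0.2, 0.1, 0.6, 0.1]]
x = [1.0, 2.0, 3.0]
target = [0.5, 1.5, 2.5]  # stand-in for a paired target sharing the partition-A factor

# Only partition A is open for this pairing; partition B gets no gradient.
grad = gated_encoder_grad(enc_w, dec_w, x, target, partition_mask=[1, 1, 0, 0])
```

In an automatic-differentiation framework the same effect would typically be obtained by detaching or masking the gated partitions before the loss is backpropagated; the manual gradient here is only meant to make the mechanism visible.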
Vowels Matthew, Camgöz Necati Cihan, Bowden Richard (2020) Nested VAE: Isolating Common Factors via Weak Supervision
15th IEEE International Conference on Automatic Face and Gesture Recognition
Fair and unbiased machine learning is an important and active field of research, as decision processes are increasingly driven by models that learn from data. Unfortunately, any biases present in the data may be learned by the model, thereby inappropriately transferring that bias into the decision making process. We identify the connection between the task of bias reduction and that of isolating factors common between domains whilst encouraging domain specific invariance. To isolate the common factors we combine the theory of deep latent variable models with information bottleneck theory for scenarios whereby data may be naturally paired across domains and no additional supervision is required. The result is the Nested Variational AutoEncoder (NestedVAE). Two outer VAEs with shared weights attempt to reconstruct the input and infer a latent space, whilst a nested VAE attempts to reconstruct the latent representation of one image from the latent representation of its paired image. In so doing, the nested VAE isolates the common latent factors/causes and becomes invariant to unwanted factors that are not shared between paired images. We also propose a new metric to provide a balanced method of evaluating consistency and classifier performance across domains which we refer to as the Adjusted Parity metric. An evaluation of NestedVAE on both domain and attribute invariance, change detection, and learning common factors for the prediction of biological sex demonstrates that NestedVAE significantly outperforms alternative methods.
Vowels Matthew, Mason Russell (2020) Comparison of pairwise dissimilarity and projective mapping tasks with auditory stimuli
Journal of the Audio Engineering Society, Audio Engineering Society
Two methods for undertaking subjective evaluation were compared: a pairwise dissimilarity task (PDT) and a projective mapping task (PMT). For a set of unambiguous, synthetic, auditory stimuli the aim was to determine: whether the PMT limits the recovered dimensionality to two dimensions; how subjects respond using PMT's two-dimensional response format; the relative time required for PDT and PMT; and hence whether PMT is an appropriate alternative to PDT for experiments involving auditory stimuli. The results of both Multi-Dimensional Scaling (MDS) analyses and Multiple Factor Analyses (MFA) indicate that, with multiple participants, PMT allows for the recovery of three meaningful dimensions. The results from the MDS and MFA analyses of the PDT data, on the other hand, were ambiguous and did not enable recovery of more than two meaningful dimensions. This result was unexpected given that PDT is generally considered not to limit the dimensionality that can be recovered. Participants took less time to complete the experiment using PMT compared to PDT (a median ratio of approximately 1:4), and employed a range of strategies to express three perceptual dimensions using PMT's two-dimensional response format. PMT may provide a viable and efficient means to elicit up to 3-dimensional responses from listeners.
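The difference in workload between the two tasks can be sketched as follows. This is an illustration under stated assumptions, not the paper's analysis: it assumes a PDT in which every pair of stimuli is rated once, and it assumes that Euclidean distance between a participant's PMT placements is used as the dissimilarity. The function names, stimulus labels and coordinates are all hypothetical.

```python
from itertools import combinations
from math import dist


def pdt_trial_count(n_stimuli):
    """Number of pairwise ratings if every pair of stimuli is judged once."""
    return n_stimuli * (n_stimuli - 1) // 2


def pmt_dissimilarities(placements):
    """Derive pairwise dissimilarities from one participant's 2-D placements.

    placements maps a stimulus label to its (x, y) position on the response
    plane; the Euclidean distance between two positions is taken as the
    dissimilarity of that pair (an assumption made for this sketch).
    """
    return {(a, b): dist(placements[a], placements[b])
            for a, b in combinations(sorted(placements), 2)}


# Hypothetical placements of three stimuli by a single participant.
placements = {"s1": (0.0, 0.0), "s2": (3.0, 4.0), "s3": (6.0, 8.0)}
dissimilarities = pmt_dissimilarities(placements)
```

Under these assumptions a participant makes one placement per stimulus in the PMT, yet the derived matrix covers the same `pdt_trial_count(n)` pairs that a PDT would have to present one at a time.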

Additional publications