9:30am - 10:30am

Friday 9 December 2022

Hybrid Deep Neural Networks

PhD Viva Open Presentation by Dmitry Minskiy

All Welcome!


University of Surrey

This event has passed




Vision-capable AI algorithms have integrated deeply into our lives and are now used extensively in a broad range of areas. They ensure the safety of our streets by analysing endless CCTV feeds, revolutionise the way we drive with autonomous cars, help to keep us healthy and save our lives in hospitals. The fundamental element of this success is their ability to extract complex semantic information from visual data (images and videos), in particular, to recognise objects, events, actions and their context. This work mainly addresses two tasks: object classification and segmentation. These problems are central in computer vision; hence we believe that novel approaches able to advance these will inevitably benefit other areas as well.

Currently, deep learning offers the most effective solution, as it enables complex multi-stage algorithms, including Neural Networks, to learn rich data representations with multiple layers of abstraction. These algorithms typically use annotated examples to learn the intricate structure of data, making minimal assumptions about the problem. Convolutional Neural Networks (CNNs), which use cascades of convolutional filters, non-linearities and pooling operations, have dominated the vision domain. Despite their undeniable advantages, CNNs exhibit certain limitations, in particular: (i) they require large datasets to learn robust representations and may still fail to generalise; (ii) they offer poor interpretability, i.e. it is hard to explain the reasoning behind a network's decisions; and (iii) their computational complexity is high, especially for large networks.

Thus, in this work, we explore how using mathematically defined filters, which are well-understood and predictable, in conjunction with deep CNN architectures could help address the fundamental challenges of deep learning. Networks that employ hand-crafted filters in combination with learnt representations are known as hybrid networks. Broadly, this work addresses the challenge of scaling the application scope of hybrid approaches from a few bespoke cases in data-limited scenarios to a broad range of applications and ensuring hybrids' performance advantage regardless of the training data available.
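To make the idea of a hybrid layer concrete, here is a minimal sketch in plain Python. It is not the design proposed in the thesis: it assumes Sobel operators as the fixed, mathematically defined filters (the work itself refers to scattering-based designs), and it uses a randomly initialised kernel as a stand-in for a learnt one. The names `conv2d_valid` and `hybrid_layer` are hypothetical and chosen only for illustration.

```python
import random

# Fixed, mathematically defined 3x3 filters (Sobel operators); never trained.
# Assumption: Sobel is used here only as a simple stand-in for hand-crafted filters.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def conv2d_valid(image, kernel):
    """Cross-correlate a 2D image with a square 2D kernel (no padding, stride 1)."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def hybrid_layer(image, learnt_kernels):
    """Return feature maps from the fixed filters followed by the learnt ones."""
    fixed_maps = [conv2d_valid(image, f) for f in (SOBEL_X, SOBEL_Y)]
    learnt_maps = [conv2d_valid(image, k) for k in learnt_kernels]
    return fixed_maps + learnt_maps  # channel-wise concatenation


# Toy input: a 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Stand-in for a trainable kernel; a real network would update it by gradient descent.
rng = random.Random(0)
learnt_kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]]

features = hybrid_layer(image, learnt_kernels)
```

In this toy setting the horizontal Sobel filter responds to the vertical edge while the vertical one stays at zero, which illustrates why fixed filters are described as well-understood and predictable: their output can be reasoned about without any training.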

We first propose a new evaluation framework aimed at minimising the optimisation bias and creating an environment for a fair comparison of the largest collection of hybrid networks. This framework is employed to comprehensively analyse existing hybrid networks and provide a holistic view of their behaviour and limitations, strengths and weaknesses. We consolidate this knowledge in the Hybrid Design Guide, which aims to provide practical support in designing efficient hybrid networks.

As a result, we identify that existing hybrid design paradigms are sub-optimal as they impose representation restrictions, leading to a performance deficit compared to conventional CNNs, particularly for bigger training datasets. Thus, we introduce a novel approach to designing hybrid networks: an inductive architectural paradigm. At the core of the novel design lies the concept of Hybrid Fusion Blocks, which facilitate the effective embedding of hand-crafted representations into existing network architectures. This design, “invisible” to the backbone network, makes our approach flexible and adaptable across various CNN architectures. Our experiments with three popular CNNs (EfficientNet, ResNeSt and U-Net) demonstrate that our inductive design is the first scattering-based design paradigm that allows hybrid networks to consistently outperform baseline architectures in image classification and segmentation tasks.

Finally, we validate our novel approach by tackling the demanding biological task of subcellular protein localisation. For this, we introduce a multi-stage system called Hybrid subCellular Protein Localiser, which employs hybrid networks to solve the problem. We demonstrate that the properties of hybrid networks can help achieve a meaningful performance gain even when added to a complex system comprising a large number of Deep Neural Networks. We also present a range of techniques to aid weakly-labelled classification, including both task-specific (Visual Integrity Detector) and generic (CRA pseudo-labelling algorithm) methods that can help boost the automated analysis of immunofluorescence microscopy images.

Overall, we show that with the novel techniques proposed, it is possible to build large-scale practical hybrid architectures with superior performance. Hence, we debunk the commonly held view that such networks are only useful in niche environments and solely in data-limited application scenarios. Thus, we uncover an exciting future for hybrid networks and motivate further research and development in this area.

Attend the event

This is a free hybrid event open to everyone. You can attend via Zoom.