Dr Sampo Kuutti

Postgraduate research student in end-to-end deep learning control for connected autonomous vehicles


Sampo Kuutti, Richard Bowden, Yaochu Jin, Phil Barber, Saber Fallah (2020) A Survey of Deep Learning Applications to Autonomous Vehicle Control, In: IEEE Transactions on Intelligent Transportation Systems IEEE

Designing a controller for autonomous vehicles capable of providing adequate performance in all driving scenarios is challenging due to the highly complex environment and inability to test the system in the wide variety of scenarios which it may encounter after deployment. However, deep learning methods have shown great promise in not only providing excellent performance for complex and non-linear control problems, but also in generalising previously learned rules to new scenarios. For these reasons, the use of deep learning for vehicle control is becoming increasingly popular. Although important advancements have been achieved in this field, these works have not been fully summarised. This paper surveys a wide range of research works reported in the literature which aim to control a vehicle through deep learning methods. Although there exists overlap between control and perception, the focus of this paper is on vehicle control, rather than the wider perception problem which includes tasks such as semantic segmentation and object detection. The paper identifies the strengths and limitations of available deep learning methods through comparative analysis and discusses the research challenges in terms of computation, architecture selection, goal specification, generalisation, verification and validation, as well as safety. Overall, this survey brings timely and topical information to a rapidly evolving field relevant to intelligent transportation systems.

Sampo Kuutti, Richard Bowden, Harita Joshi, Robert de Temple, Saber Fallah (2019) End-to-end Reinforcement Learning for Autonomous Longitudinal Control Using Advantage Actor Critic with Temporal Context, In: 2019 IEEE Intelligent Transportation Systems Conference IEEE

Reinforcement learning has been used widely for autonomous longitudinal control algorithms. However, many existing algorithms suffer from sample inefficiency and from jerky driving behaviour in the learned systems. In this paper, we propose a reinforcement learning algorithm and a training framework to address these two disadvantages of previous algorithms proposed in this field. The proposed system uses an Advantage Actor Critic (A2C) learning system with recurrent layers to introduce temporal context within the network. This allows the learned system to evaluate continuous control actions based on previous states and actions in addition to current states. Moreover, slow training of the algorithm caused by its sample inefficiency is addressed by utilising another neural network to approximate the vehicle dynamics. Using a neural network as a proxy for the simulator significantly benefits training: it reduces the need for the reinforcement learning agent to query the simulation, which is a major bottleneck in learning, and because both the reinforcement learning network and the proxy network can be deployed on the same GPU, learning speed is considerably improved. Simulation results from testing in IPG CarMaker show the effectiveness of our recurrent A2C algorithm, compared to an A2C without recurrent layers.

Sampo Kuutti, Saber Fallah, Konstantinos Katsaros, Mehrdad Dianati, F Mccullough, A Mouzakitis (2018) A Survey of the State-of-the-Art Localisation Techniques and Their Potentials for Autonomous Vehicle Applications, In: IEEE Internet of Things 5(2) pp. 829-846 IEEE

For an autonomous vehicle to operate safely and effectively, an accurate and robust localisation system is essential. While there are a variety of vehicle localisation techniques in the literature, there is a lack of effort in comparing these techniques and identifying their potentials and limitations for autonomous vehicle applications. Hence, this paper evaluates the state-of-the-art vehicle localisation techniques and investigates their applicability to autonomous vehicles. The analysis starts by discussing the techniques which use only the information obtained from on-board vehicle sensors. It is shown that although some techniques can achieve the accuracy required for autonomous driving, they suffer from the high cost of the sensors and from sensor performance limitations in different driving scenarios (e.g. cornering, intersections) and different environmental conditions (e.g. darkness, snow). The paper continues the analysis by considering the techniques which benefit from off-board information obtained from V2X communication channels, in addition to vehicle sensory information. The analysis shows that augmenting sensory information with off-board information has the potential to yield low-cost localisation systems with high accuracy and robustness; however, their performance depends on the penetration rate of nearby connected vehicles or infrastructure and the quality of network service.

Marco Visca, Sampo Juhani Kuutti, Roger Powell, Yang Gao, Mohammad Saber Fallah (2021) Deep Learning Traversability Estimator for Mobile Robots in Unstructured Environments, In: Towards Autonomous Robotic Systems 22nd Annual Conference, TAROS 2021, Lincoln, UK, September 8-10, 2021, Proceedings Springer Verlag

Terrain traversability analysis plays a major role in ensuring safe robotic navigation in unstructured environments. However, real-time constraints frequently limit the accuracy of online tests, especially in scenarios where realistic robot-terrain interactions are complex to model. In this context, we propose a deep learning framework trained in an end-to-end fashion from elevation maps and trajectories to estimate the occurrence of failure events. The network is first trained and tested in simulation over synthetic maps generated by the OpenSimplex algorithm. The prediction performance of the deep learning framework is illustrated by its ability to retain over 94% recall of the original simulator at 30% of the computational time. Finally, the network is transferred and tested on real elevation maps collected by the SEEKER consortium during the Martian rover test trial in the Atacama desert in Chile. We show that transferring and fine-tuning an application-independent pre-trained model retains better performance than training solely on scarcely available real data.

Sampo Kuutti, Saber Fallah, Richard Bowden (2020) Training Adversarial Agents to Exploit Weaknesses in Deep Control Policies, In: 2020 IEEE International Conference on Robotics and Automation (ICRA) pp. 108-114 IEEE

Deep learning has become an increasingly common technique for various control problems, such as robotic arm manipulation, robot navigation, and autonomous vehicles. However, the downside of using deep neural networks to learn control policies is their opaque nature and the difficulties of validating their safety. As the networks used to obtain state-of-the-art results become increasingly deep and complex, the rules they have learned and how they operate become more challenging to understand. This presents an issue, since in safety-critical applications the safety of the control policy must be ensured to a high confidence level. In this paper, we propose an automated black box testing framework based on adversarial reinforcement learning. The technique uses an adversarial agent, whose goal is to degrade the performance of the target model under test. We test the approach on an autonomous vehicle problem, by training an adversarial reinforcement learning agent, which aims to cause a deep neural network-driven autonomous vehicle to collide. Two neural networks trained for autonomous driving are compared, and the results from the testing are used to compare the robustness of their learned control policies. We show that the proposed framework is able to find weaknesses in both control policies that were not evident during online testing, and therefore demonstrate a significant benefit over manual testing methods.

Sampo Kuutti, Richard Bowden, Saber Fallah (2021) Weakly Supervised Reinforcement Learning for Autonomous Highway Driving via Virtual Safety Cages, In: Sensors (Basel, Switzerland) 21(6) 2032 MDPI

The use of neural networks and reinforcement learning has become increasingly popular in autonomous vehicle control. However, the opaqueness of the resulting control policies presents a significant barrier to deploying neural network-based control in autonomous vehicles. In this paper, we present a reinforcement learning-based approach to autonomous vehicle longitudinal control, where rule-based safety cages provide enhanced safety for the vehicle as well as weak supervision to the reinforcement learning agent. By guiding the agent to meaningful states and actions, this weak supervision improves the convergence during training and enhances the safety of the final trained policy. This rule-based supervisory controller has the further advantage of being fully interpretable, thereby enabling traditional validation and verification approaches to ensure the safety of the vehicle. We compare models with and without safety cages, as well as models with optimal and constrained model parameters, and show that the weak supervision consistently improves the safety of exploration, speed of convergence, and model performance. Additionally, we show that when the model parameters are constrained or sub-optimal, the safety cages can enable a model to learn a safe driving policy even when the model could not be trained to drive through reinforcement learning alone.

Sampo Kuutti, Richard Bowden, Harita Joshi, Robert de Temple, Saber Fallah (2019) Safe Deep Neural Network-driven Autonomous Vehicles Using Software Safety Cages, In: Proceedings of the 20th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2019) Springer International Publishing

Deep learning is a promising class of techniques for controlling an autonomous vehicle. However, functional safety validation is seen as a critical issue for these systems due to the lack of transparency in deep neural networks and the safety-critical nature of autonomous vehicles. The black box nature of deep neural networks limits the effectiveness of traditional verification and validation methods. In this paper, we propose two software safety cages, which aim to limit the control action of the neural network to a safe operational envelope. The safety cages impose limits on the control action during critical scenarios, which, if breached, change the control action to a more conservative value. This has the benefit that the behaviour of the safety cages is interpretable, and therefore traditional functional safety validation techniques can be applied. The work here presents a deep neural network trained for longitudinal vehicle control, with safety cages designed to prevent forward collisions. Simulated testing in critical scenarios shows the effectiveness of the safety cages in preventing forward collisions, whilst under normal highway driving unnecessary interruptions are eliminated and the deep learning control policy is able to perform unhindered. Interventions by the safety cages are also used to re-train the network, resulting in a more robust control policy.