Devising planning algorithms for autonomous driving is non-trivial due to the complex and uncertain interaction dynamics between road users. In this paper, we introduce a planning framework encompassing multiple action policies that are learned jointly from episodes of human-human interactions in naturalistic driving. The policy model is composed of encoder-decoder recurrent neural networks for modeling the sequential nature of interactions and mixture density networks for characterizing the probability distributions over driver actions. The model is used both to generate a finite set of context-dependent candidate plans for an autonomous car and to anticipate the probable future plans of human drivers. An evaluation stage then selects the plan with the highest expected utility for execution. Our approach leverages rapid parallel sampling of action distributions on a graphics processing unit, offering fast computation even when modeling the interactions among multiple vehicles over several time steps. We present ablation experiments and comparisons with two existing baseline methods to highlight several design choices that we found essential to our model's success. We test the proposed planning approach in a simulated highway driving environment, showing that with the model the autonomous car can plan actions that mimic the interactive behavior of humans.
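A minimal NumPy sketch of the kind of sampling step this abstract describes: drawing many candidate actions at once from a Gaussian mixture produced by a mixture density network head. The one-dimensional acceleration action and all mixture parameters below are hypothetical illustrations, not values from the paper.

```python
import numpy as np

def sample_mdn_actions(pi, mu, sigma, n_samples, rng=None):
    """Draw n_samples actions from a 1-D Gaussian mixture with
    weights pi (K,), means mu (K,), and std devs sigma (K,) --
    a stand-in for one time step of an MDN decoder output."""
    rng = rng or np.random.default_rng(0)
    # Pick a mixture component per sample, then draw from its Gaussian.
    comps = rng.choice(len(pi), size=n_samples, p=pi)
    return rng.normal(mu[comps], sigma[comps])

# Hypothetical bimodal distribution over longitudinal acceleration (m/s^2),
# e.g. a "keep speed" mode and a less likely "brake" mode.
pi = np.array([0.7, 0.3])
mu = np.array([0.2, -2.0])
sigma = np.array([0.3, 0.5])
samples = sample_mdn_actions(pi, mu, sigma, n_samples=1000)
```

Because each sample is independent, this operation vectorizes naturally, which is what makes GPU-parallel rollout of many candidate plans cheap.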
Quantifying and encoding occupants’ preferences as an objective function for the tactical decision making of autonomous vehicles is a challenging task. This paper presents a low-complexity approach to lane-change initiation and planning that facilitates highly automated driving on freeways. The conditions under which human drivers find different manoeuvres desirable are learned from naturalistic driving data, eliminating the need for an engineered objective function and for the incorporation of expert knowledge in the form of rules. Motion planning is formulated as a finite-horizon optimisation problem with safety constraints. It is shown that the decision model can replicate human drivers’ discretionary lane-change decisions with up to 92% accuracy. A further proof-of-concept simulation of an overtaking manoeuvre is presented, in which the actions of the simulated vehicle are logged while the dynamic environment evolves as per ground-truth data recordings.
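A toy sketch of a finite-horizon optimisation with a hard safety constraint, in the spirit of the formulation above: enumerate a small set of constant control inputs over the horizon, discard any that violate a minimum-gap constraint to the lead vehicle, and keep the cheapest feasible one. The action set, cost (tracking a desired speed), and all constants are hypothetical, not the paper's.

```python
def plan_longitudinal(ego_v, lead_gap, lead_v, horizon=5, dt=1.0,
                      v_des=30.0, min_gap=10.0):
    """Pick the best constant acceleration over a short horizon,
    subject to keeping at least min_gap metres to the lead vehicle."""
    best_accel, best_cost = None, float("inf")
    for accel in (-2.0, 0.0, 2.0):          # candidate constant inputs
        v, gap, cost, feasible = ego_v, lead_gap, 0.0, True
        for _ in range(horizon):
            v = max(0.0, v + accel * dt)    # ego speed update
            gap += (lead_v - v) * dt        # relative motion to leader
            if gap < min_gap:               # hard safety constraint
                feasible = False
                break
            cost += (v - v_des) ** 2        # track desired speed
        if feasible and cost < best_cost:
            best_accel, best_cost = accel, cost
    return best_accel, best_cost

# With a large gap, accelerating towards v_des is feasible and cheapest;
# with a tight gap, the safety constraint forces a more cautious input.
a_free, _ = plan_longitudinal(ego_v=25.0, lead_gap=40.0, lead_v=25.0)
a_tight, _ = plan_longitudinal(ego_v=25.0, lead_gap=12.0, lead_v=25.0)
```

Keeping the action set small and the horizon short is what makes this style of constrained search low-complexity enough for online use.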