Let me start with a quote that defines behavioural cloning:
> Behavioural cloning is a method by which human sub-cognitive skills can be captured and reproduced in a computer program. As the human subject performs the skill, his or her actions are recorded along with the situation that gave rise to the action. A log of these records is used as input to a learning program. The learning program outputs a set of rules that reproduce the skilled behaviour. This method can be used to construct automatic control systems for complex tasks for which classical control theory is inadequate. It can also be used for training.
With that definition in mind, let's see how this can be done. But first, some background on this project. This is my third project for the Udacity Self-Driving Car Nanodegree. The whole project is done in Python with Keras (TensorFlow as the backend). I assume that readers of this post are familiar with Keras and the basics of neural networks (in particular, convolutional neural networks). If you want to follow along, you will need a GPU-enabled system and the Udacity self-driving car simulator (Linux, macOS, Windows).
Dataset (Data Collection)
Data was collected with the Udacity self-driving car simulator. The simulator has two modes, training and autonomous, and two tracks. Training mode logs the data used to learn the driving behaviour. To do this, I drove the car on track 1, keeping it at the centre of the lane, and logged data for 3 laps.
But one must understand what it means to log the data here. So, let me explain what data is being logged (captured) and how.
Data capturing in the simulator is inspired by the Nvidia paper "End to End Learning for Self-Driving Cars". According to this paper, the car is mounted with three front-facing cameras: one on the right of the bonnet, one at the centre, and one on the left. The image below describes the camera setup and steering control (image from the Nvidia paper).
data collection system view
With the above camera setup, the Udacity simulator captures three images per frame along with the corresponding steering angle, and logs them into a folder together with a CSV file that maps each image's location to the steering angle.
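As a rough sketch of what reading that log looks like (the simulator writes a `driving_log.csv` with the columns centre image, left image, right image, steering, throttle, brake, speed; the function name here is my own, not from the project code):

```python
import csv

def load_driving_log(csv_path):
    """Parse the simulator's driving_log.csv into a list of samples.

    Assumes the Udacity column order:
    center, left, right, steering, throttle, brake, speed.
    """
    samples = []
    with open(csv_path) as f:
        for row in csv.reader(f):
            if len(row) < 7:
                continue  # skip malformed rows
            samples.append({
                "center": row[0].strip(),
                "left": row[1].strip(),
                "right": row[2].strip(),
                "steering": float(row[3]),
                "throttle": float(row[4]),
                "brake": float(row[5]),
                "speed": float(row[6]),
            })
    return samples
```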
Data Collection Strategy
I followed this strategy to collect the required data:
- Drove on track 1 for three laps, keeping the car at the centre of the lane.
- Drove on track 1 for one lap in the opposite direction (to reduce turn bias).
- Repeatedly drove the car to the left of the lane and then back to the centre (this teaches the network how to return to the centre lane when the car drifts off), and did the same from the right of the lane.
- Recorded multiple smooth passes through each turn.
After collecting data with the above strategies I ended up with 34,074 images. But looking at the steering angle distribution, most of the images have a zero or near-zero steering angle (for example, when the car is driving on a straight stretch of the lane).
Steering angle distribution
The naive approach I took was to remove all images with exactly zero steering angle (I didn't remove images with angles close to zero). Let's also look at the distribution of speed.
Distribution of speed
Some images have a speed of less than 20 mph; I removed those and kept only images with speed above 20 mph (the intuition being that at lower speeds the car might be turning and may not be in the centre of the lane).
After removing all the zero-steering-angle images I was left with fewer than ~5000 images, which is not enough to learn the driving pattern. To overcome this problem I used the following image-augmentation techniques to generate more data.
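The two filters above can be sketched as a single pass over the parsed samples (assuming each sample is a dict with `steering` and `speed` keys; the function name is illustrative):

```python
def filter_samples(samples, min_speed=20.0):
    """Drop frames with exactly zero steering angle and frames
    slower than min_speed (in the simulator's speed units).
    """
    return [
        s for s in samples
        if s["steering"] != 0.0 and s["speed"] >= min_speed
    ]
```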
Vertical Random Shift
For each image, I randomly shift it vertically by between -20% and +20% of its height. This vertical shift helps the model perform significantly better on both tracks and generalise more.
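A minimal NumPy sketch of such a shift (edge rows are filled with black; my actual pipeline may have used a warp instead, so treat this as one possible implementation):

```python
import numpy as np

def random_vertical_shift(image, max_frac=0.2, rng=None):
    """Shift an HxWxC image up or down by a random fraction of its
    height, in [-max_frac, +max_frac]. Rows shifted in from the
    edge are filled with zeros (black).
    """
    rng = rng or np.random.default_rng()
    h = image.shape[0]
    shift = int(rng.uniform(-max_frac, max_frac) * h)
    shifted = np.zeros_like(image)
    if shift > 0:
        shifted[shift:] = image[:h - shift]   # move content down
    elif shift < 0:
        shifted[:h + shift] = image[-shift:]  # move content up
    else:
        shifted = image.copy()
    return shifted
```

The steering angle is unchanged by a vertical shift, which is why only the image is returned.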
Horizontal Flip
For each image, I also created a horizontally flipped copy and changed the sign of the steering angle at the same time. This doubled the dataset.
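The flip is a one-liner with NumPy (the helper name is my own):

```python
import numpy as np

def flip_horizontal(image, angle):
    """Mirror the image left-right and negate the steering angle."""
    return np.fliplr(image), -angle
```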
Random Shadow
I added a random shadow to each image so that the model generalises better to other tracks as well.
Let's look at how an image looks after applying a random shadow.
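One common way to implement a random shadow, and roughly what I did, is to pick a random line across the image and darken everything on one side of it (the `strength` parameter and function name are my own choices for this sketch):

```python
import numpy as np

def add_random_shadow(image, strength=0.5, rng=None):
    """Darken a random half-plane of an HxWxC image to simulate a
    shadow. strength scales pixels in the shadowed region
    (0.5 -> half brightness).
    """
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    # random line from a point on the top edge to a point on the bottom edge
    x_top, x_bot = rng.uniform(0, w, size=2)
    ys, xs = np.mgrid[0:h, 0:w]
    # shadow every pixel left of the line (interpolated per row)
    mask = xs < (x_top + (x_bot - x_top) * ys / h)
    shadowed = image.astype(np.float64)
    shadowed[mask] *= strength
    return shadowed.astype(image.dtype)
```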
Model Architecture
I started with the Nvidia CNN architecture and used it as the reference model for further improvement.
Starting from the Nvidia architecture, I added one more fully connected layer (my motivation here was to learn the pattern on track 1 with a minimum number of epochs).
The above model can be programmed in Keras. I used the ELU activation, which converges faster for regression problems and introduces non-linearity into the network.
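Here is a sketch of such a model in Keras: the standard Nvidia stack (five convolutional layers, then fully connected layers down to a single steering output) with ELU activations and one extra Dense layer. The exact size of my extra layer may have differed; the `Dense(25)` below is illustrative.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Lambda
from tensorflow.keras.models import Sequential

def build_model(input_shape=(160, 320, 3)):
    """Nvidia-style CNN with ELU activations and one extra FC layer."""
    model = Sequential([
        Input(shape=input_shape),
        # normalise pixels to [-0.5, 0.5]
        Lambda(lambda x: x / 255.0 - 0.5),
        Conv2D(24, (5, 5), strides=(2, 2), activation="elu"),
        Conv2D(36, (5, 5), strides=(2, 2), activation="elu"),
        Conv2D(48, (5, 5), strides=(2, 2), activation="elu"),
        Conv2D(64, (3, 3), activation="elu"),
        Conv2D(64, (3, 3), activation="elu"),
        Flatten(),
        Dense(100, activation="elu"),
        Dense(50, activation="elu"),
        Dense(25, activation="elu"),   # the extra fully connected layer
        Dense(10, activation="elu"),
        Dense(1),                      # steering angle (regression output)
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```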
Let me show you how this training setup works with the steering wheel (image from the Nvidia paper).
Training network setup
I used the Adam optimizer, which adapts the learning rate using momentum estimates. I trained this network for 4 epochs and, surprisingly, it learned to drive on track 1. Impressive!
Training & Generator
Since we're dealing with image data, we can't load it all into memory at once, so we use Keras's awesome `fit_generator` function. This function accepts a Python generator as an argument, which yields the data. The generator randomly chooses `batch_size` samples from our X/y pairs. I empirically found that a batch size of 64 works better than 32, 128 or 256.
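A minimal version of such a generator looks like this (in my real pipeline each batch would also load images from disk and apply the augmentations above; note that newer Keras versions accept generators directly in `model.fit`):

```python
import numpy as np

def batch_generator(images, angles, batch_size=64, rng=None):
    """Yield random (X, y) batches forever, as fit_generator expects.

    images and angles are index-able arrays of equal length.
    """
    rng = rng or np.random.default_rng()
    n = len(images)
    while True:
        idx = rng.choice(n, size=batch_size, replace=False)
        yield images[idx], angles[idx]
```

It would then be passed to training as something like `model.fit_generator(batch_generator(X, y, 64), ...)`.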
The following video shows the steering performance.
This model drove the car flawlessly around track 1 after only 4 epochs of training. It's really quite amazing how quickly the model learns to perform on this track.
This project taught me a lot, especially about data collection and image generators (a cool thing to learn).
Clearly, this is a very basic example of end-to-end learning for self-driving cars, but it gives a rough idea of what these models are capable of.
Have suggestions for how I could improve my solution? I'd love to hear them. Thanks for reading!