In part 6 we added dropblock support to our model. In this tutorial we will build a simple ResNet on top of it and look at the results.
ResNets
There are tons of sources that explain ResNets and skip connections very well, so only very briefly: when we build a deeper neural network we expect accuracy to keep increasing, but in practice it often gets worse instead. The reason is that as the network gets deeper, problems like vanishing gradients start to seriously affect the model’s performance. To solve this, skip connections were added to the network, making it easy for the model to effectively ignore layers that hurt performance. Adding more layers will then at least not hurt performance, and in the better case will improve it.
The following is an image from Andrew Ng’s lecture on ResNets:
So all we have to do is add these skip connections by summing the activations of layer L with the logits of layer L+n (where n is the skip length) and feeding that sum to the activation function to get the activations of layer L+n.
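In symbols, with a(L) the activations of layer L, z(L+n) the logits of layer L+n, and g the activation function, that is:

a(L+n) = g(z(L+n) + a(L))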
Implementation
First let’s see the parameters of our model:
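The actual values live in the repository, so the snippet below is only a placeholder sketch of the kind of parameters involved; every name and number in it is made up for illustration, including the dropblock settings carried over from part 6.

# Placeholder hyperparameters (the real values are in the repository).
params = {
    'num_classes': 10,       # number of output classes (assumed)
    'learning_rate': 1e-3,   # optimizer learning rate (assumed)
    'num_epochs': 30,        # training epochs (assumed)
    'batch_size': 64,        # mini-batch size (assumed)
    'keep_prob': 0.9,        # dropblock keep probability (assumed)
    'block_size': 5,         # dropblock block size (assumed)
}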
Then we will be updating our convolution block function to use two convolution operations instead of one:
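The real code is in the repository; the TensorFlow 1.x-style sketch below only shows the general shape of the updated block. The function name, the 3x3 kernel size and the use of tf.layers are my assumptions. The second convolution deliberately returns its pre-activation output (logits) so the residual block below can add the skip connection before the non-linearity.

import tensorflow as tf

def conv_block(inputs, filters, name):
    # Two 3x3 convolutions instead of the single one we had before.
    # The second convolution has no activation: it returns logits so the
    # caller can add a skip connection before applying the non-linearity.
    with tf.variable_scope(name):
        x = tf.layers.conv2d(inputs, filters, 3, padding='same',
                             activation=tf.nn.relu)
        return tf.layers.conv2d(x, filters, 3, padding='same',
                                activation=None)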
Then we will add another block called a residual block:
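Again a hedged sketch rather than the repository’s exact code: it wraps everything in a variable scope, runs two of the convolution blocks above, sums the skip connection, applies the activation, and then runs dropblock and max pooling. The dropblock signature used here (inputs, keep_prob, block_size, is_training) is assumed from part 6 and may differ from the actual helper.

def residual_block(inputs, filters, keep_prob, block_size, is_training, name):
    with tf.variable_scope(name):
        # Activations of "layer L" and logits of "layer L+n".
        a = tf.nn.relu(conv_block(inputs, filters, 'conv_block_1'))
        z = conv_block(a, filters, 'conv_block_2')
        # Skip connection: sum, then apply the activation function.
        out = tf.nn.relu(a + z)
        # Dropblock helper from part 6 (signature assumed), then max pooling.
        out = dropblock(out, keep_prob, block_size, is_training)
        return tf.layers.max_pooling2d(out, pool_size=2, strides=2)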
In the above function, all we do is wrap the convolution blocks in a new variable scope, run two convolution blocks, and sum their values as we described above. Then we run dropblock and max pooling.
Now let’s have a look at the model function:
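The actual model function is in the repository; the sketch below only illustrates its overall shape, four residual blocks followed by three fully connected hidden layers. Every filter count, layer size and name is a placeholder I picked for illustration.

def model(images, num_classes, keep_prob, block_size, is_training):
    with tf.variable_scope('resnet_model'):
        x = images
        # Four residual blocks (filter counts are illustrative).
        for i, filters in enumerate([32, 64, 128, 256]):
            x = residual_block(x, filters, keep_prob, block_size,
                               is_training, name='residual_block_%d' % (i + 1))
        x = tf.layers.flatten(x)
        # Three fully connected hidden layers (sizes are illustrative).
        for i, units in enumerate([512, 256, 128]):
            x = tf.layers.dense(x, units, activation=tf.nn.relu,
                                name='hidden_%d' % (i + 1))
        # Final logits layer.
        return tf.layers.dense(x, num_classes, name='logits')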
We will be using 4 residual blocks followed by our three hidden layers.
Results
Training set accuracy:
Test set accuracy:
Final training set accuracy: 99.5%
Final training set loss: 0.0300
Final test set accuracy: 70.7%
Final test set loss: 1.2815
So we increased our accuracy while using fewer training epochs.
If you want to check the full state of the project so far, click here to go to the repository.
See you in part 8.