In part 2 we trained our simple model and saved it in the saved_model folder. As you may have noticed, though, training a convolutional network takes a lot of time and resources. That is why in this tutorial we will introduce depth-wise separable convolution, which is faster than normal convolution, and see how the two compare.
Depth-wise separable convolution
Depth-wise separable convolution combines two basic ideas: depth-wise convolution and point-wise convolution.
Depth-wise convolution
Let us first talk about depth-wise convolution. The idea is very simple: instead of each filter spanning the full depth of the input, you use one filter per channel.
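To make the idea concrete, here is a minimal NumPy sketch of a depth-wise convolution. The helper name `depthwise_conv2d` and the shapes are illustrative, not the tutorial's actual code; the point is that each channel is convolved with its own filter, so the output depth equals the input depth:

```python
import numpy as np

def depthwise_conv2d(x, filters):
    """x: (H, W, C) input; filters: (k, k, C), one kxk filter per channel.
    Returns a 'valid' convolution of shape (H-k+1, W-k+1, C)."""
    h, w, c = x.shape
    k = filters.shape[0]
    out = np.zeros((h - k + 1, w - k + 1, c))
    for ch in range(c):  # each channel is filtered independently
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                out[i, j, ch] = np.sum(x[i:i+k, j:j+k, ch] * filters[:, :, ch])
    return out

x = np.random.rand(8, 8, 3)          # an 8x8 "image" with 3 channels
f = np.random.rand(3, 3, 3)          # one 3x3 filter for each of the 3 channels
print(depthwise_conv2d(x, f).shape)  # (6, 6, 3): the depth is unchanged
```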
This is an image of a normal convolution with 256 filters. Now let’s see what depth-wise convolution looks like:
There is a filter for every channel, as we discussed earlier. But what if we want to change or increase the depth of the output? That's where point-wise convolution comes in.
Point-wise convolution
Point-wise convolution is a 1x1xD filter, where D is the depth of the input, that is applied after the depth-wise convolution.
The image above uses 256 of these filters to reach the same output shape as the normal convolution in the first image.
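Since a 1x1 convolution is just a per-pixel linear mix of the channels, it can be sketched in a few lines of NumPy. The helper name `pointwise_conv2d` and the shapes below are illustrative assumptions, matching the 256-filter example above:

```python
import numpy as np

def pointwise_conv2d(x, filters):
    """x: (H, W, D); filters: (D, N), i.e. N filters of shape 1x1xD.
    Returns (H, W, N): every pixel's channels are linearly recombined."""
    return x @ filters  # matmul broadcasts over the spatial dimensions

x = np.random.rand(6, 6, 3)          # e.g. the output of a depth-wise step
w = np.random.rand(3, 256)           # 256 point-wise filters of shape 1x1x3
print(pointwise_conv2d(x, w).shape)  # (6, 6, 256): the depth is now 256
```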
That is depth-wise separable convolution in a nutshell. If you want more detail on why it is faster, check out A Basic Introduction to Separable Convolutions.
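A quick back-of-the-envelope count shows where the speedup comes from. Using the example shapes above (3x3 kernels, 3 input channels, 256 output channels; bias terms ignored for simplicity):

```python
k, c_in, c_out = 3, 3, 256

normal = k * k * c_in * c_out            # one kxkxC filter per output channel
separable = k * k * c_in + c_in * c_out  # depth-wise weights + point-wise weights
print(normal, separable)                 # 6912 795
print(round(normal / separable, 1))      # 8.7: roughly 8.7x fewer weights
```

The same ratio applies to the multiplications done per output pixel, which is why the separable version trains noticeably faster.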
Implementation
Only one line in our whole implementation from part 2 needs to change:
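The part 2 code isn't reproduced here, but assuming the model is built with tf.keras, the change amounts to swapping Conv2D for SeparableConv2D (the filter count, kernel size, and input shape below are illustrative, not necessarily those of the part 2 model):

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    # before: layers.Conv2D(256, (3, 3), activation="relu")
    layers.SeparableConv2D(256, (3, 3), activation="relu"),  # the one-line swap
])
model.summary()  # far fewer parameters than the Conv2D version
```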
And that’s it.
Results
[Plot: training set accuracy]
[Plot: test set accuracy]
Final training set accuracy: 99.4%
Final training set loss: 0.0448
Final test set accuracy: 51.3%
Final test set loss: 2.1067
The results are not better than with normal convolution, but they are close, and training is much faster.
If you want to check the full state of the project so far, click here to go to the repository.
See you in part 5.