Tensorflow CIFAR-100 Series (4)

Depth-wise separable convolution

Posted by Omar Mohamed on April 19, 2019


In part 2 we trained our simple model and saved it in the saved_model folder. But as you may have noticed, training a convolutional network takes a lot of time and resources. That is why in this tutorial we will introduce depth-wise separable convolution, which is faster than normal convolution, and see how the two compare.

Depth-wise separable convolution

Depth-wise separable convolution consists of two basic operations: a depth-wise convolution followed by a point-wise convolution.

Depth-wise convolution

Let us first talk about depth-wise convolution. The idea is very simple: instead of each filter spanning the full depth of the input, you use one single-channel filter per input channel.

[Image: a normal convolution with 256 filters]

This is an image of a normal convolution with 256 filters. Now let’s see what depth-wise convolution looks like:

[Image: depth-wise convolution, one filter per input channel]
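To make the shapes concrete, here is a minimal TF 1.x sketch (the 32x32x3 input and the variable names are illustrative, not taken from the tutorial's code) showing that a depth-wise convolution leaves the channel count unchanged:

    import tensorflow as tf

    # A batch of 32x32 RGB images, like the CIFAR-100 input.
    x = tf.placeholder(tf.float32, [None, 32, 32, 3])

    # One 3x3 filter per input channel (channel multiplier = 1):
    # shape = [filter_height, filter_width, in_channels, channel_multiplier].
    depthwise_filter = tf.get_variable("dw", [3, 3, 3, 1])

    # Each channel is convolved with its own filter; depth stays at 3.
    dw_out = tf.nn.depthwise_conv2d(x, depthwise_filter,
                                    strides=[1, 1, 1, 1], padding='SAME')
    print(dw_out.shape)  # (?, 32, 32, 3)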

So we have one filter per channel, as discussed above. But what if we want to change or increase the depth of the output? That's where point-wise convolution comes in.

Point-wise convolution

Point-wise convolution is a 1x1xD filter (where D is the input depth) applied after the depth-wise convolution.

[Image: point-wise convolution with 256 1x1 filters]

The image above uses 256 of these filters to match the output shape of the normal convolution in the first image.
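Since a point-wise convolution is just an ordinary convolution with a 1x1 kernel, it can be sketched with tf.nn.conv2d. Continuing the illustrative shapes from the sketch above:

    # A 1x1 convolution mapping 3 input channels to 256 output channels:
    # shape = [1, 1, in_channels, out_channels].
    pointwise_filter = tf.get_variable("pw", [1, 1, 3, 256])

    # Spatial size is unchanged; only the depth grows from 3 to 256.
    pw_out = tf.nn.conv2d(dw_out, pointwise_filter,
                          strides=[1, 1, 1, 1], padding='SAME')
    print(pw_out.shape)  # (?, 32, 32, 256)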

And that, briefly, is depth-wise separable convolution. If you want more detail on why it is faster, check out A Basic Introduction to Separable Convolutions.
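The speed-up is easy to see from the multiplication counts. For a KxK kernel, D_in input channels, and D_out output channels, a normal convolution does K·K·D_in·D_out multiplications per output position, while the separable version does K·K·D_in (depth-wise) plus D_in·D_out (point-wise). A quick back-of-the-envelope check with made-up but typical sizes:

    k, d_in, d_out = 3, 64, 128

    normal = k * k * d_in * d_out            # 73,728 multiplies per position
    separable = k * k * d_in + d_in * d_out  # 576 + 8,192 = 8,768

    print(normal / float(separable))         # ~8.4x fewer multiplications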

Implementation

Only one call in our whole implementation from part 2 changes: the convolution itself becomes tf.nn.separable_conv2d, which takes separate depth-wise and point-wise weight tensors:

    # method that runs one convolution block
    def run_conv_block(x, layer_name, filter_size, input_depth, output_depth):
        with tf.variable_scope(layer_name):
            conv = run_batch_norm(x)
            # Depth-wise filter: one filter per input channel (channel multiplier 1);
            # point-wise filter: a 1x1 convolution mapping input_depth to output_depth.
            conv = tf.nn.separable_conv2d(conv,
                                          get_conv_weight("depth_wise", [filter_size, filter_size, input_depth, 1]),
                                          get_conv_weight("point_wise", [1, 1, input_depth, output_depth]),
                                          strides=[1, 1, 1, 1], padding='SAME')
            conv = tf.nn.max_pool(value=conv, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
            conv = run_batch_norm(conv)
            conv = tf.nn.relu(conv)
            conv = tf.nn.dropout(conv, keep_prob=conv_keep_prob)

            return conv

And that’s it.
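For reference, here is a minimal sketch of what the replaced call presumably looked like in part 2, assuming a standard tf.nn.conv2d with a single full-depth weight tensor (check the repository for the exact version):

    # Normal convolution: one weight tensor spanning the full input depth,
    # shape = [filter_size, filter_size, input_depth, output_depth].
    conv = tf.nn.conv2d(conv,
                        get_conv_weight("conv", [filter_size, filter_size, input_depth, output_depth]),
                        strides=[1, 1, 1, 1], padding='SAME')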

Results

Training set accuracy:

[Image: training set accuracy curve]

Test set accuracy:

[Image: test set accuracy curve]

Final training set accuracy: 99.4%
Final training set loss: 0.0448
Final test set accuracy: 51.3%
Final test set loss: 2.1067

The results are not quite as good as normal convolution's, but they are close, and training is much faster.

If you want to check the full state of the project so far, click here to go to the repository.
See you in part 5.