Over the past year I have also worked with Deep Learning techniques, and I would like to share with you how to build and train a Convolutional Neural Network from scratch, using TensorFlow. Later on we can use this knowledge as a building block to make interesting Deep Learning applications.

The contents of this blog post are as follows:

1. TensorFlow basics:
• 1.1 Constants and Variables
• 1.2 TensorFlow Graphs and Sessions
• 1.3 Placeholders and feed_dicts
2. Neural Networks in TensorFlow
• 2.1 Introduction
• 2.3 Creating a (simple) 1-layer Neural Network
• 2.4 The many faces of TensorFlow
• 2.5 Creating the LeNet5 CNN
• 2.6 How the parameters affect the output size of a layer
• 2.7 Adjusting the LeNet5 architecture
• 2.8 Impact of Learning Rate and Optimizer
3. Deep Neural Networks in TensorFlow
• 3.1 AlexNet
• 3.2 VGG Net-16
• 3.3 AlexNet Performance
4. Final words

1. TensorFlow basics:

Here I will give a short introduction to TensorFlow for people who have never worked with it before. If you want to start building Neural Networks immediately, or you are already familiar with TensorFlow, you can go ahead and skip to section 2. If you would like to know more about TensorFlow, you can also have a look at this repository, or the notes of lecture 1 and lecture 2 of Stanford’s CS20SI course.

1.1 Constants and Variables

The most basic units within TensorFlow are Constants, Variables and Placeholders.

The difference between a tf.constant() and a tf.Variable() should be clear: a constant has a fixed value, and once you set it, it cannot be changed. The value of a Variable can be changed after it has been set, but its type and shape cannot.

Besides tf.zeros() and tf.ones(), which create a Tensor filled with zeros or ones, there is the tf.random_normal() function, which creates a Tensor filled with values drawn from a normal distribution (by default with a mean of 0.0 and a stddev of 1.0).
There is also the tf.truncated_normal() function, which creates a Tensor with values drawn from a normal distribution in which any value more than two standard deviations from the mean is discarded and re-drawn.

With this knowledge, we can already create weight matrices and bias vectors which can be used in a neural network.
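As a small sketch of how these pieces fit together, the snippet below creates a weight matrix and bias vector with the functions above. It is written against the TensorFlow 1.x API of the original post (reachable as tf.compat.v1 on TF 2.x); the layer sizes (784 inputs, 10 outputs) are illustrative assumptions.

```python
import tensorflow.compat.v1 as tf  # TF 1.x-style API; plain `tensorflow` on TF 1.x
tf.disable_eager_execution()

a = tf.constant(2.0)            # fixed value: cannot be changed once set
b = tf.Variable(tf.zeros([4]))  # value can change; its type and shape cannot

r = tf.random_normal([2, 2])    # samples from N(0.0, 1.0) by default

# Weight matrix and bias vector for a layer with 784 inputs and 10 outputs.
# truncated_normal re-draws any sample further than 2 stddevs from the mean.
weights = tf.Variable(tf.truncated_normal([784, 10], stddev=0.1))
biases = tf.Variable(tf.zeros([10]))
```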

1.2. Tensorflow Graphs and Sessions

In TensorFlow, all of the different Variables and the operations performed on these Variables are stored in a Graph. After you have built a Graph containing all of the computational steps necessary for your model, you can run this Graph within a Session. This Session then distributes all of the computations across the available CPU and GPU resources.
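A minimal sketch of this Graph/Session split (TF 1.x API; via tf.compat.v1 on TF 2.x): building the Graph defines the computation, and only running it inside a Session produces values.

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    x = tf.constant([1.0, 2.0, 3.0])
    y = tf.Variable(tf.ones([3]))
    total = tf.reduce_sum(x + y)           # defines the op; computes nothing yet
    init = tf.global_variables_initializer()

# The Graph only describes the computation; the Session executes it.
with tf.Session(graph=graph) as session:
    session.run(init)
    result = session.run(total)            # 9.0
```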

1.3 Placeholders and feed_dicts

We have seen the various ways in which we can create constants and variables. TensorFlow also has placeholders; these do not require an initial value and only serve to allocate the necessary amount of memory. During a session, these placeholders can be filled with (external) data via a feed_dict.

Below is an example of the usage of a placeholder.
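A minimal placeholder/feed_dict sketch (TF 1.x API via tf.compat.v1; the shape and batch values are illustrative):

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

# The placeholder reserves memory for data supplied at run time:
# any batch size, three features per example.
x = tf.placeholder(tf.float32, shape=[None, 3])
doubled = 2.0 * x

with tf.Session() as session:
    batch = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32)
    # feed_dict maps each placeholder to a concrete value for this run.
    result = session.run(doubled, feed_dict={x: batch})  # [[2, 4, 6], [8, 10, 12]]
```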

2. Neural Networks in TensorFlow

2.1 Introduction

The graph containing the Neural Network (illustrated in the image above) should contain the following steps:

1. The input datasets: the training dataset and labels, the test dataset and labels (and the validation dataset and labels).
The test and validation datasets can be placed inside a tf.constant(), while the training dataset is placed in a tf.placeholder() so that it can be fed in batches during training (stochastic gradient descent).
2. The Neural Network model with all of its layers. This can be a simple fully connected neural network consisting of only one layer, or a more complicated neural network consisting of 5, 9, 16, etc. layers.
3. The weight matrices and bias vectors defined in the proper shape and initialized to their initial values. (One weight matrix and bias vector per layer.)
4. The loss value: the model outputs a logit vector (the estimated training labels), and by comparing the logits with the actual labels we can calculate the loss value (with the softmax with cross-entropy function). The loss value indicates how close the estimated training labels are to the actual training labels and will be used to update the weight values.
5. An optimizer, which will use the calculated loss value to update the weights and biases with backpropagation.
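The five steps above can be sketched as a single-layer fully connected network skeleton (TF 1.x API via tf.compat.v1; the MNIST-like sizes, stddev and learning rate are illustrative assumptions, not values from this post):

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

image_size, num_labels, batch_size = 28, 10, 128

graph = tf.Graph()
with graph.as_default():
    # 1. Input data: the training batch is fed in through placeholders.
    tf_train_data = tf.placeholder(
        tf.float32, shape=[batch_size, image_size * image_size])
    tf_train_labels = tf.placeholder(tf.float32, shape=[batch_size, num_labels])

    # 3. One weight matrix and one bias vector for the single layer.
    weights = tf.Variable(
        tf.truncated_normal([image_size * image_size, num_labels], stddev=0.1))
    biases = tf.Variable(tf.zeros([num_labels]))

    # 2. The model: a single fully connected layer producing logits.
    logits = tf.matmul(tf_train_data, weights) + biases

    # 4. The loss: softmax with cross-entropy between logits and labels.
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels,
                                                logits=logits))

    # 5. The optimizer: updates weights and biases via backpropagation.
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

    init = tf.global_variables_initializer()
```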
