]]>

]]>

]]>

In a previous post I talked about how to preprocess and explore image dataset. In this post, I will talk about how to model image data with neural networks having a single neuron, using sigmoid function. Original version of this blog can be found here. This is equivalent to logistic regression. Only difference is the way we estimate weights(coeffcients) of the the inputs. The traditional way of estimating logistic regression weights(coefficients) is to use analytical methods(an optimization technique). But the neural network way of estimating weights(coefficients) is to use gradient descent algorithm. Before jumping to modeling, I will try to give an intuition about the sigmoid function. Sigmoid function intuitionThe sigmoid function is given by the formula,For any input x, a(sigmoid of x) will vary between 0 and 1. When x is positive and large, e^x(numerator) and 1+e^x(denominator) will be approximately same and value of a will be one. Similarly when x is a large negative number, e^x will be approximately zero and value of a will be zero. Let's see two examples.import osimport numpy as npfrom scipy.misc import imresizeimport matplotlib.pyplot as plt%matplotlib inlinex=500 print(1/(1+np.exp(-x))) Output:1.0 x=-500 print(1/(1+np.exp(-x))) Output:7.12457640674e-218 Another important aspect of sigmoid function is that it is a non-linear function in x. This fact becomes more powerful in case of multi layered neural networks, as it will help in unlocking many hidden non-linear patterns in the data. A single sigmoid function looks like the following graph, for different values of x.x=np.linspace(-10,10,100) #linspace generates 100 uniformly spaced values between -10 and 10plt.figure(figsize=(10, 5)) #Setting up the figure size of width 10 and height 5plt.plot(x,sigmoid(x),'b') #Plot sigmoid(x) in Y-axis and x in X-axis with line color blueplt.grid() #Add grid to the plot plt.rc('axes', labelsize=13) #Set x label abd y label fontsize to 13plt.xlabel('x') #Label x-axis plt.ylabel('a (sigmoid(x))') #Label y axisplt.rc('font', size=15) #Set text fontsize default as 15plt.suptitle('Sigmoid Function') #Create a supertitle for the plot. You can use title as wellOutput: As you can see from the graph, a(sigmoid(x)) varies between 0 and 1. This makes sigmoid function and in turn logistic regression suitable for binomial classification problem. That means we can use logistic regression or sigmoid function when the target varible has only two values(0 or 1). This makes it suitable for our purpose, in which we are trying to predict the gender of the celebrity from images. Gender(our target variable) has only two values in our dataset, male(0) and female(1).Sigmoid function essentially gives out the probabilty target variable being 1 for a given input. i.e in our case given an image, sigmoid function gives the probability of that image being that of a female celebrity, since in our target variable female gender is indicated as 1. Although, probabilty of an image being male can be easily calculated as 1-sigmoid(input image) will give that. Another point to remember is that, for our problem input x is a combination of variables or pixels to be precise. Let's denote this combination of input variables as z.where, w1 = weight of the first variable (in our case the first pixel) x1 = first variable (in our case, first pixel) and so on.. b = bias (similar to intercept in linear regression) where is the sigmoid function and a is the predicted values(probabilities) In matrix notation, the equations can be written as,where '.' indicates matrix multiplication W is the row vector of all weights of dimension[1,num_px] num_px is the number of pixels(variables) X is the input matrix of dimension[num_px,m] m = no.of training examples A is the array of predicted values of dimension[1,m]Cost FunctionThe unknowns in the above equations are weights(w's) and bias(b). The idea of logistic regression or single neuron neural network(from now on I will use this terminology) is to find the best values of weights and bias which gives the minimum error(cost).So for training the model first we have to define the cost function. We define the cost function for the binomial prediction aswhere, J(a,y) is the cost which is a function of a and y and it is a scalar meaning single value. This cost is called negative log likelihood. Lower the cost, better the model m = number of training examples y = array of true labels or actual values a = , the predicted values z = w1x1 + w2x2 +...+w_n*x_n + bIn matrix form we write it as,where, m is the number of training examples is the transpose of A which is the array of predicted values of dimensions [m,1] Y is the array actual values or true labels of dimensions [1,m]Steps to train a single neuronNow we have to use gradient descent to find the values of W and b that minimizes the cost. In short, training of single neuron neural network using gradient descent involves the following steps:1) Initialize parameters i.e W and b2) Forward Propagation: Calculate Z and A using the initialized parameters3) Compute cost4) Backward propagation: Take gradient(derivative) of cost function with respect to W and b5) Use the gradients to update the values of W and b6) Repeat steps 2 to 5 for a fixed number of timesForward PropagationIn steps 2 and 3, we calculate the values of A and Z as mentioned before and compute the cost. This step is called forward propagation.dZ = = A-YdW = = db = = where is the transpose of X. In the above diagram, backward propagation is highlighted by red colored line. From the point of view of logical flow of the network, backward propagation starts from the cost and reaches W. The intuition is we need to update the parameters(W and b) of the model to minimze cost, and in order to do that we need to find the derivative of cost w.r.t the parameters we want to update. However, cost is not directly dependent on parameter(W and b) but on functions(A and Z) which uses these parameters. Hence we need to use chain rule to calculate the derivative of cost w.r.t to parameters. Each derivative term in the chain rule happens at a different part in the model, which starts at cost and flows backward. Parameter UpdatesIn step 5, we need to update the parameters as followsHere is a parameter called learning rate. It controls how big the update(or step) is in each iteration. If is too small, it may take a long time to find the best parameters and if is too big we may overshoot and never reach the optimal parameters. In step 6, we need to repeat the steps a fixed number of times. There is no rule as such how many iterations we have to run. It varies from dataset to dataset. If we set alpha to a very small value, we may need to iterate more number of times. Generally it's a hyperparameter which we have to tune.That's all we need to know to implement a single neuron neural network.So to reiterate the steps involved:1) Initialize parameters i.e W and b2) Forward Propagation: Calculate Z and A using the initialized parameters3) Compute cost4) Backward propagation: Take gradient(derivative) of cost function with respect to W and b5) Use the gradients to update the values of W and b6) Repeat steps 2 to 5 for a fixed number of times Single Neuron ImplementationI will continue from where I stopped in the last article. I will continue with the same problem and same dataset.Our problem statement was to predict the gender of the celebrity from the image.After preprocessing, our final data sets were train_x(train data input) , y_train(target variable for the training set), test_x(test data input) , y_test(target variable for the testing set).Let's take a quick look at the data attributes.m_train = train_x.shape[1] m_test = y_test.shape[1] num_px = train_x_orig.shape[1] print ("Number of training examples: m_train = " + str(m_train)) print ("Number of testing examples: m_test = " + str(m_test)) print ("Height/Width of each image: num_px = " + str(num_px)) print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)") print ("train_x shape: " + str(train_x.shape)) print ("y_train shape: " + str(y_train.shape)) print ("test_x shape: " + str(test_x.shape)) print ("y_test shape: " + str(y_test.shape))Output:Number of training examples: m_train = 80 Number of testing examples: m_test = 20 Height/Width of each image: num_px = 64 Each image is of size: (64, 64, 3) train_x shape: (12288, 80) y_train shape: (1, 80) test_x shape: (12288, 20) y_test shape: (1, 20) Step 1) Initialize parameters i.e W and b Let's write a function to initialize W and b. There are different intialization techniques. For this exercise, we will intialize both W and b to zero.def initialize_with_zeros(dim): #Function takes in a parameter dim whic is equal to no of columns or pixels in the dataset w = np.zeros((1,dim)) b = 0 assert(w.shape == (1, dim)) #Assert statement ensures W and b has the required shape assert(isinstance(b, float) or isinstance(b, int)) return w, b Steps 2, 3 and 4 Forward Propagation, Cost computation and Backward propagation We will define a sigmoid function first, which will take any array or vector as an input and returns the sigmoid of the input.def sigmoid(z):s = 1/(1+np.exp(-z)) return s Now let's write a function called propagate, which will take W(weights),b(bias),X(input matrix) and Y(target variable) as inputs. It should return cost and gradients dW and db. We need to calculate the following:A= = Cost = dW = = db = = where '.' indicates matrix multiplication. In python, np.dot(numpy.dot) function is used for matrix multiplication.def propagate(w, b, X, Y): """ Arguments: w -- weights, a numpy array of size (num_px * num_px * 3, 1) b -- bias, a scalar X -- data of size (num_px * num_px * 3, number of examples) Y -- true "label" vector (containing 0 if male celebrity, 1 if female celebrity) of size (1, number of examples)Return: cost -- negative log-likelihood cost for logistic regression dw -- gradient of the loss with respect to w, thus same shape as w db -- gradient of the loss with respect to b, thus same shape as b """ m = X.shape[1] # FORWARD PROPAGATION (FROM X TO COST) A = sigmoid(np.dot(w,X)+b) # compute sigmoid- np.dot is used for matrix multiplication cost = (-1/m)*(np.dot(Y,np.log(A.T))+ np.dot((1-Y),np.log((1-A).T))) # compute cost # BACKWARD PROPAGATION (TO FIND GRAD) dw = (1/m)*np.dot((A-Y),X.T) db = (1/m)*np.sum((A-Y)) assert(dw.shape == w.shape) assert(db.dtype == float) cost = np.squeeze(cost) #to make cost a scalar i.e a single value assert(cost.shape == ()) grads = {"dw": dw, "db": db} return grads, cost Steps 5 and 6 Optimization:Update parameters and iterate Let's define a function optimize which will repeat steps 2 through 5 for a given number of times.Steps 2 till 4 can be calculated by calling the propagate function. We need to define step 5 here. i.e parameter updates. Update rules are:were is the learning rate.After iterating through the given number of iterations, this function should return the final weights and biasdef optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False): """ This function optimizes w and b by running a gradient descent algorithm Arguments: w -- weights, a numpy array of size (num_px * num_px * 3, 1) b -- bias, a scalar X -- data of shape (num_px * num_px * 3, number of examples) Y -- true "label" vector (containing 0 if male celebrity, 1 if female celebrity) of size (1, number of examples) num_iterations -- number of iterations of the optimization loop learning_rate -- learning rate of the gradient descent update rule print_cost -- True to print the loss every 100 steps Returns: params -- dictionary containing the weights w and bias b grads -- dictionary containing the gradients of the weights and bias with respect to the cost function costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve. """ costs = [] for i in range(num_iterations): #This will iterate i from 0 till num_iterations-1 # Cost and gradient calculation grads, cost = propagate(w, b, X, Y) # Retrieve derivatives from grads dw = grads["dw"] db = grads["db"] # update rule w = w-learning_rate*dw b = b-learning_rate*db # Record the costs for every 100th iteration if i % 100 == 0: costs.append(cost) # Print the cost every 100 training examples if print_cost and i % 100 == 0: print ("Cost after iteration %i: %f" %(i, cost)) # plot the cost plt.rcParams['figure.figsize'] = (10.0, 10.0) plt.plot(np.squeeze(costs)) plt.ylabel('cost') plt.xlabel('iterations (per hundreds)') plt.title("Learning rate =" + str(learning_rate)) plt.show() params = {"w": w, "b": b} grads = {"dw": dw, "db": db} return params, grads, costs Prediction using learned parametersFrom the previous function we will get the final weights and bias. We can use those weights to predict the target variable(gender) on new data(test data). Let's define a function for prediction capability. If the predicted probability is 0.5 or less, the image will be calssified as 0(male) else 1(female).def predict(w, b, X): ''' Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b) Arguments: w -- weights, a numpy array of size (num_px * num_px * 3, 1) b -- bias, a scalar X -- data of size (num_px * num_px * 3, number of examples) Returns: Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X ''' m = X.shape[1] Y_prediction = np.zeros((1,m)) #w = w.reshape(X.shape[0], 1) # Compute vector "A" predicting the probabilities of having a female celebrity in the picture A = sigmoid(np.dot(w,X)+b) Y_prediction=np.round(A) assert(Y_prediction.shape == (1, m)) return Y_prediction Putting everything togetherLet's put training and prediction into a sigle function called model, which will train the model on training data and predict on testing data and return accuracy of the model. Since we have to predict 0 or 1, we can calculate accuray using the formula:It indicates what percentage of images have been rightly classified or predicted. You can define any accuracy or evaluation metrics. However, in this series we will use accuracy defined as above.def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False): """ Builds the logistic regression model by calling the function implemented previously Arguments: X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train) Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train) X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test) Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test) num_iterations -- hyperparameter representing the number of iterations to optimize the parameters learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize() print_cost -- Set to true to print the cost every 100 iterations Returns: d -- dictionary containing information about the model. """ # initialize parameters with zeros m_train=X_train.shape[0] w, b = initialize_with_zeros(m_train)# Gradient descent parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations= num_iterations, learning_rate = learning_rate, print_cost = print_cost) # Retrieve parameters w and b from dictionary "parameters" w = parameters["w"] b = parameters["b"] # Predict test/train set examples Y_prediction_test = predict(w, b, X_test) Y_prediction_train = predict(w, b, X_train)# Print train/test Errors print("train accuracy: {} %".format(100*(1 - np.mean(np.abs(Y_prediction_train - Y_train)) ))) print("test accuracy: {} %".format(100*(1 - np.mean(np.abs(Y_prediction_test - Y_test)) ))) d = {"costs": costs, "Y_prediction_test": Y_prediction_test, "Y_prediction_train" : Y_prediction_train, "w" : w, "b" : b, " learning_rate" : learning_rate, "num_iterations": num_iterations} return d<Python Code End><Python Code Start>d = model(train_x, y_train, test_x, y_test, num_iterations = 1000, learning_rate = 0.005, print_cost = True) Output:Cost after iteration 0: 0.693147 Cost after iteration 100: 0.325803 Cost after iteration 200: 0.209219 Cost after iteration 300: 0.159637 Cost after iteration 400: 0.128275 Cost after iteration 500: 0.106781 Cost after iteration 600: 0.091209 Cost after iteration 700: 0.079450 Cost after iteration 800: 0.070282 Cost after iteration 900: 0.062948 train accuracy: 100.0 % test accuracy: 65.0 % The accuracy of the model is around 65% with learning rate =0.005 and number of iterations =1000. Probably we can achieve bit more better results by tuning these two parameters.Now, let's take a look at the mis labeled or wrongly predicted images.def print_mislabeled_images(classes, X, y, p): """ Plots images where predictions and truth were different. X -- dataset y -- true labels p -- predictions """ a = p + y mislabeled_indices = np.asarray(np.where(a == 1)) plt.rcParams['figure.figsize'] = (40.0, 40.0) # set default size of plots num_images = len(mislabeled_indices[0]) for i in range(num_images): index = mislabeled_indices[1][i] plt.subplot(2, num_images, i + 1) plt.imshow(X[:,index].reshape(64,64,3), interpolation='sinc') plt.axis('off') plt.rc('font', size=20) plt.title("Prediction: " + classes[int(p[0,index])] + " \n Class: " + classes[y[0,index]])print_mislabeled_images(classes, test_x, y_test, d["Y_prediction_test"]) Output:So now we have completed training a single node neural network. We have achieved an accuracy of 65 %. Not bad for a single neuron or simple logistic regression. It's a bit long post but understanding the basics is the key to understand more complex algorithms. Sigmoid function(or similar functions) is the building block for Neural Networks, Deep learning and AI. I hope this article gave a good intuition about the sigmoid function and neural network approach.Building on top of this article, in the next post, I will talk about how to train a multi layer neural network. Before wrapping up, I will try to show what the neuron has learned at the end of training. Now this part is not for weak hearted people. Continue only if you are brave and curious :D.Let's use the final weights to multiply corresponding pixels in training data and scale by a factor 255, since we divided pixels by 255 for standardization.Now let's plot an image from the reconstructed data.test=d["w"].T*train_x*255test=test.T.reshape(80,64,64,3)plt.rcParams['figure.figsize'] = (10.0, 10.0)plt.imshow(test[0], interpolation='sinc')Output:You may either find the image artistic or scary or weird. Neverthless it's still very interesting, atleast for me ;). For plotting the above image I used sinc interpolation. We can try different interpolations and see the effects. methods = [None, 'none', 'nearest', 'bilinear', 'bicubic', 'spline16', 'spline36', 'hanning', 'hamming', 'hermite', 'kaiser', 'quadric', 'catrom', 'gaussian', 'bessel', 'mitchell', 'sinc', 'lanczos']# Fixing random state for reproducibility np.random.seed(19680801) fig, axes = plt.subplots(3, 6, figsize=(24, 12), subplot_kw={'xticks': [], 'yticks': []})fig.subplots_adjust(hspace=0.3, wspace=0.05)for ax, interp_method in zip(axes.flat, methods): plt.rc('font', size=15) ax.imshow(test[0], interpolation=interp_method, cmap=None) ax.set_title(interp_method)plt.show()Output: Let's create a montage and compare the reconstructed images vs original.def montage(images, saveto='montage.png'): """Draw all images as a montage separated by 1 pixel borders. Also saves the file to the destination specified by `saveto`. Parameters ---------- images : numpy.ndarray Input array to create montage of. Array should be: batch x height x width x channels. saveto : str Location to save the resulting montage image. Returns ------- m : numpy.ndarray Montage image. """ if isinstance(images, list): images = np.array(images) img_h = images.shape[1] img_w = images.shape[2] n_plots = int(np.ceil(np.sqrt(images.shape[0]))) if len(images.shape) == 4 and images.shape[3] == 3: m = np.ones( (images.shape[1] * n_plots + n_plots + 1, images.shape[2] * n_plots + n_plots + 1, 3)) * 0.5 else: m = np.ones( (images.shape[1] * n_plots + n_plots + 1, images.shape[2] * n_plots + n_plots + 1)) * 0.5 for i in range(n_plots): for j in range(n_plots): this_filter = i * n_plots + j if this_filter < images.shape[0]: this_img = images[this_filter] m[1 + i + i * img_h:1 + i + (i + 1) * img_h, 1 + j + j * img_w:1 + j + (j + 1) * img_w] = this_img #plt.imsave(arr=m, fname=saveto) return m The above function can be used to create montages. Now let's combine some of the reconstructed images and original data and create a montage.compare = np.concatenate((test[52:54], data[52:54]), axis=0) compare.shape Output:(4, 64, 64, 3) Now let us try to create the montage with two different interpolations.plt.imshow(montage(compare,saveto='montage.png'),interpolation='spline36')plt.show()plt.imshow(montage(compare,saveto='montage.png'),interpolation='bicubic')plt.show()Output:If you look carefully, in the reconstructed image, hair colors of the image have been captured differently. This is an indication that the algorithm has learned some of the facial features from the data.Also, other thing we can do is to generate the montage with different interpolations for comparison.methods = [None, 'none', 'nearest', 'bilinear', 'bicubic', 'spline16', 'spline36', 'hanning', 'hamming', 'hermite', 'kaiser', 'quadric', 'catrom', 'gaussian', 'bessel', 'mitchell', 'sinc', 'lanczos']# Fixing random state for reproducibility np.random.seed(19680801) fig, axes = plt.subplots(3, 6, figsize=(24, 12), subplot_kw={'xticks': [], 'yticks': []})fig.subplots_adjust(hspace=0.3, wspace=0.05)for ax, interp_method in zip(axes.flat, methods): ax.imshow(montage(compare,saveto='montage.png'), interpolation=interp_method, cmap=None) ax.set_title(interp_method)plt.show() Output: Images are very interesting. We can find very interesting patterns and visulaize how an algorithm learns to identify patterns in the image. It always amazes me. On that note I am putting my pen down on this article. In the next article, I will talk about multi layer neural networks and try to explore what the neurons have learned from the images. References: 'Neural Networks and Deep Learning' on Coursera by Andrew Ng Calculus by Gilbert StrangSee More

In a previous post I talked about how to preprocess and explore image dataset. In this post, I will talk about how to model image data with neural networks having a single neuron, using sigmoid function. Original version of this blog can be found here. This is equivalent to logistic regression. Only difference is the way we estimate weights(coeffcients) of the the inputs. The traditional way of estimating logistic regression weights(coefficients) is to use analytical methods(an optimization technique). But the neural network way of estimating weights(coefficients) is to use gradient descent algorithm. Before jumping to modeling, I will try to give an intuition about the sigmoid function. Sigmoid function intuitionThe sigmoid function is given by the formula,For any input x, a(sigmoid of x) will vary between 0 and 1. When x is positive and large, e^x(numerator) and 1+e^x(denominator) will be approximately same and value of a will be one. Similarly when x is a large negative number, e^x will be approximately zero and value of a will be zero. Let's see two examples.import osimport numpy as npfrom scipy.misc import imresizeimport matplotlib.pyplot as plt%matplotlib inlinex=500 print(1/(1+np.exp(-x))) Output:1.0 x=-500 print(1/(1+np.exp(-x))) Output:7.12457640674e-218 Another important aspect of sigmoid function is that it is a non-linear function in x. This fact becomes more powerful in case of multi layered neural networks, as it will help in unlocking many hidden non-linear patterns in the data. A single sigmoid function looks like the following graph, for different values of x.x=np.linspace(-10,10,100) #linspace generates 100 uniformly spaced values between -10 and 10plt.figure(figsize=(10, 5)) #Setting up the figure size of width 10 and height 5plt.plot(x,sigmoid(x),'b') #Plot sigmoid(x) in Y-axis and x in X-axis with line color blueplt.grid() #Add grid to the plot plt.rc('axes', labelsize=13) #Set x label abd y label fontsize to 13plt.xlabel('x') #Label x-axis plt.ylabel('a (sigmoid(x))') #Label y axisplt.rc('font', size=15) #Set text fontsize default as 15plt.suptitle('Sigmoid Function') #Create a supertitle for the plot. You can use title as wellOutput: As you can see from the graph, a(sigmoid(x)) varies between 0 and 1. This makes sigmoid function and in turn logistic regression suitable for binomial classification problem. That means we can use logistic regression or sigmoid function when the target varible has only two values(0 or 1). This makes it suitable for our purpose, in which we are trying to predict the gender of the celebrity from images. Gender(our target variable) has only two values in our dataset, male(0) and female(1).Sigmoid function essentially gives out the probabilty target variable being 1 for a given input. i.e in our case given an image, sigmoid function gives the probability of that image being that of a female celebrity, since in our target variable female gender is indicated as 1. Although, probabilty of an image being male can be easily calculated as 1-sigmoid(input image) will give that. Another point to remember is that, for our problem input x is a combination of variables or pixels to be precise. Let's denote this combination of input variables as z.where, w1 = weight of the first variable (in our case the first pixel) x1 = first variable (in our case, first pixel) and so on.. b = bias (similar to intercept in linear regression) where is the sigmoid function and a is the predicted values(probabilities) In matrix notation, the equations can be written as,where '.' indicates matrix multiplication W is the row vector of all weights of dimension[1,num_px] num_px is the number of pixels(variables) X is the input matrix of dimension[num_px,m] m = no.of training examples A is the array of predicted values of dimension[1,m]Cost FunctionThe unknowns in the above equations are weights(w's) and bias(b). The idea of logistic regression or single neuron neural network(from now on I will use this terminology) is to find the best values of weights and bias which gives the minimum error(cost).So for training the model first we have to define the cost function. We define the cost function for the binomial prediction aswhere, J(a,y) is the cost which is a function of a and y and it is a scalar meaning single value. This cost is called negative log likelihood. Lower the cost, better the model m = number of training examples y = array of true labels or actual values a = , the predicted values z = w1x1 + w2x2 +...+w_n*x_n + bIn matrix form we write it as,where, m is the number of training examples is the transpose of A which is the array of predicted values of dimensions [m,1] Y is the array actual values or true labels of dimensions [1,m]Steps to train a single neuronNow we have to use gradient descent to find the values of W and b that minimizes the cost. In short, training of single neuron neural network using gradient descent involves the following steps:1) Initialize parameters i.e W and b2) Forward Propagation: Calculate Z and A using the initialized parameters3) Compute cost4) Backward propagation: Take gradient(derivative) of cost function with respect to W and b5) Use the gradients to update the values of W and b6) Repeat steps 2 to 5 for a fixed number of timesForward PropagationIn steps 2 and 3, we calculate the values of A and Z as mentioned before and compute the cost. This step is called forward propagation.dZ = = A-YdW = = db = = where is the transpose of X. In the above diagram, backward propagation is highlighted by red colored line. From the point of view of logical flow of the network, backward propagation starts from the cost and reaches W. The intuition is we need to update the parameters(W and b) of the model to minimze cost, and in order to do that we need to find the derivative of cost w.r.t the parameters we want to update. However, cost is not directly dependent on parameter(W and b) but on functions(A and Z) which uses these parameters. Hence we need to use chain rule to calculate the derivative of cost w.r.t to parameters. Each derivative term in the chain rule happens at a different part in the model, which starts at cost and flows backward. Parameter UpdatesIn step 5, we need to update the parameters as followsHere is a parameter called learning rate. It controls how big the update(or step) is in each iteration. If is too small, it may take a long time to find the best parameters and if is too big we may overshoot and never reach the optimal parameters. In step 6, we need to repeat the steps a fixed number of times. There is no rule as such how many iterations we have to run. It varies from dataset to dataset. If we set alpha to a very small value, we may need to iterate more number of times. Generally it's a hyperparameter which we have to tune.That's all we need to know to implement a single neuron neural network.So to reiterate the steps involved:1) Initialize parameters i.e W and b2) Forward Propagation: Calculate Z and A using the initialized parameters3) Compute cost4) Backward propagation: Take gradient(derivative) of cost function with respect to W and b5) Use the gradients to update the values of W and b6) Repeat steps 2 to 5 for a fixed number of times Single Neuron ImplementationI will continue from where I stopped in the last article. I will continue with the same problem and same dataset.Our problem statement was to predict the gender of the celebrity from the image.After preprocessing, our final data sets were train_x(train data input) , y_train(target variable for the training set), test_x(test data input) , y_test(target variable for the testing set).Let's take a quick look at the data attributes.m_train = train_x.shape[1] m_test = y_test.shape[1] num_px = train_x_orig.shape[1] print ("Number of training examples: m_train = " + str(m_train)) print ("Number of testing examples: m_test = " + str(m_test)) print ("Height/Width of each image: num_px = " + str(num_px)) print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)") print ("train_x shape: " + str(train_x.shape)) print ("y_train shape: " + str(y_train.shape)) print ("test_x shape: " + str(test_x.shape)) print ("y_test shape: " + str(y_test.shape))Output:Number of training examples: m_train = 80 Number of testing examples: m_test = 20 Height/Width of each image: num_px = 64 Each image is of size: (64, 64, 3) train_x shape: (12288, 80) y_train shape: (1, 80) test_x shape: (12288, 20) y_test shape: (1, 20) Step 1) Initialize parameters i.e W and b Let's write a function to initialize W and b. There are different intialization techniques. For this exercise, we will intialize both W and b to zero.def initialize_with_zeros(dim): #Function takes in a parameter dim whic is equal to no of columns or pixels in the dataset w = np.zeros((1,dim)) b = 0 assert(w.shape == (1, dim)) #Assert statement ensures W and b has the required shape assert(isinstance(b, float) or isinstance(b, int)) return w, b Steps 2, 3 and 4 Forward Propagation, Cost computation and Backward propagation We will define a sigmoid function first, which will take any array or vector as an input and returns the sigmoid of the input.def sigmoid(z):s = 1/(1+np.exp(-z)) return s Now let's write a function called propagate, which will take W(weights),b(bias),X(input matrix) and Y(target variable) as inputs. It should return cost and gradients dW and db. We need to calculate the following:A= = Cost = dW = = db = = where '.' indicates matrix multiplication. In python, np.dot(numpy.dot) function is used for matrix multiplication.def propagate(w, b, X, Y): """ Arguments: w -- weights, a numpy array of size (num_px * num_px * 3, 1) b -- bias, a scalar X -- data of size (num_px * num_px * 3, number of examples) Y -- true "label" vector (containing 0 if male celebrity, 1 if female celebrity) of size (1, number of examples)Return: cost -- negative log-likelihood cost for logistic regression dw -- gradient of the loss with respect to w, thus same shape as w db -- gradient of the loss with respect to b, thus same shape as b """ m = X.shape[1] # FORWARD PROPAGATION (FROM X TO COST) A = sigmoid(np.dot(w,X)+b) # compute sigmoid- np.dot is used for matrix multiplication cost = (-1/m)*(np.dot(Y,np.log(A.T))+ np.dot((1-Y),np.log((1-A).T))) # compute cost # BACKWARD PROPAGATION (TO FIND GRAD) dw = (1/m)*np.dot((A-Y),X.T) db = (1/m)*np.sum((A-Y)) assert(dw.shape == w.shape) assert(db.dtype == float) cost = np.squeeze(cost) #to make cost a scalar i.e a single value assert(cost.shape == ()) grads = {"dw": dw, "db": db} return grads, cost Steps 5 and 6 Optimization:Update parameters and iterate Let's define a function optimize which will repeat steps 2 through 5 for a given number of times.Steps 2 till 4 can be calculated by calling the propagate function. We need to define step 5 here. i.e parameter updates. Update rules are:were is the learning rate.After iterating through the given number of iterations, this function should return the final weights and biasdef optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False): """ This function optimizes w and b by running a gradient descent algorithm Arguments: w -- weights, a numpy array of size (num_px * num_px * 3, 1) b -- bias, a scalar X -- data of shape (num_px * num_px * 3, number of examples) Y -- true "label" vector (containing 0 if male celebrity, 1 if female celebrity) of size (1, number of examples) num_iterations -- number of iterations of the optimization loop learning_rate -- learning rate of the gradient descent update rule print_cost -- True to print the loss every 100 steps Returns: params -- dictionary containing the weights w and bias b grads -- dictionary containing the gradients of the weights and bias with respect to the cost function costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve. """ costs = [] for i in range(num_iterations): #This will iterate i from 0 till num_iterations-1 # Cost and gradient calculation grads, cost = propagate(w, b, X, Y) # Retrieve derivatives from grads dw = grads["dw"] db = grads["db"] # update rule w = w-learning_rate*dw b = b-learning_rate*db # Record the costs for every 100th iteration if i % 100 == 0: costs.append(cost) # Print the cost every 100 training examples if print_cost and i % 100 == 0: print ("Cost after iteration %i: %f" %(i, cost)) # plot the cost plt.rcParams['figure.figsize'] = (10.0, 10.0) plt.plot(np.squeeze(costs)) plt.ylabel('cost') plt.xlabel('iterations (per hundreds)') plt.title("Learning rate =" + str(learning_rate)) plt.show() params = {"w": w, "b": b} grads = {"dw": dw, "db": db} return params, grads, costs Prediction using learned parametersFrom the previous function we will get the final weights and bias. We can use those weights to predict the target variable(gender) on new data(test data). Let's define a function for prediction capability. If the predicted probability is 0.5 or less, the image will be calssified as 0(male) else 1(female).def predict(w, b, X): ''' Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b) Arguments: w -- weights, a numpy array of size (num_px * num_px * 3, 1) b -- bias, a scalar X -- data of size (num_px * num_px * 3, number of examples) Returns: Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X ''' m = X.shape[1] Y_prediction = np.zeros((1,m)) #w = w.reshape(X.shape[0], 1) # Compute vector "A" predicting the probabilities of having a female celebrity in the picture A = sigmoid(np.dot(w,X)+b) Y_prediction=np.round(A) assert(Y_prediction.shape == (1, m)) return Y_prediction Putting everything togetherLet's put training and prediction into a sigle function called model, which will train the model on training data and predict on testing data and return accuracy of the model. Since we have to predict 0 or 1, we can calculate accuray using the formula:It indicates what percentage of images have been rightly classified or predicted. You can define any accuracy or evaluation metrics. However, in this series we will use accuracy defined as above.def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False): """ Builds the logistic regression model by calling the function implemented previously Arguments: X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train) Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train) X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test) Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test) num_iterations -- hyperparameter representing the number of iterations to optimize the parameters learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize() print_cost -- Set to true to print the cost every 100 iterations Returns: d -- dictionary containing information about the model. """ # initialize parameters with zeros m_train=X_train.shape[0] w, b = initialize_with_zeros(m_train)# Gradient descent parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations= num_iterations, learning_rate = learning_rate, print_cost = print_cost) # Retrieve parameters w and b from dictionary "parameters" w = parameters["w"] b = parameters["b"] # Predict test/train set examples Y_prediction_test = predict(w, b, X_test) Y_prediction_train = predict(w, b, X_train)# Print train/test Errors print("train accuracy: {} %".format(100*(1 - np.mean(np.abs(Y_prediction_train - Y_train)) ))) print("test accuracy: {} %".format(100*(1 - np.mean(np.abs(Y_prediction_test - Y_test)) ))) d = {"costs": costs, "Y_prediction_test": Y_prediction_test, "Y_prediction_train" : Y_prediction_train, "w" : w, "b" : b, " learning_rate" : learning_rate, "num_iterations": num_iterations} return d<Python Code End><Python Code Start>d = model(train_x, y_train, test_x, y_test, num_iterations = 1000, learning_rate = 0.005, print_cost = True) Output:Cost after iteration 0: 0.693147 Cost after iteration 100: 0.325803 Cost after iteration 200: 0.209219 Cost after iteration 300: 0.159637 Cost after iteration 400: 0.128275 Cost after iteration 500: 0.106781 Cost after iteration 600: 0.091209 Cost after iteration 700: 0.079450 Cost after iteration 800: 0.070282 Cost after iteration 900: 0.062948 train accuracy: 100.0 % test accuracy: 65.0 % The accuracy of the model is around 65% with learning rate =0.005 and number of iterations =1000. Probably we can achieve bit more better results by tuning these two parameters.Now, let's take a look at the mis labeled or wrongly predicted images.def print_mislabeled_images(classes, X, y, p): """ Plots images where predictions and truth were different. X -- dataset y -- true labels p -- predictions """ a = p + y mislabeled_indices = np.asarray(np.where(a == 1)) plt.rcParams['figure.figsize'] = (40.0, 40.0) # set default size of plots num_images = len(mislabeled_indices[0]) for i in range(num_images): index = mislabeled_indices[1][i] plt.subplot(2, num_images, i + 1) plt.imshow(X[:,index].reshape(64,64,3), interpolation='sinc') plt.axis('off') plt.rc('font', size=20) plt.title("Prediction: " + classes[int(p[0,index])] + " \n Class: " + classes[y[0,index]])print_mislabeled_images(classes, test_x, y_test, d["Y_prediction_test"]) Output:So now we have completed training a single node neural network. We have achieved an accuracy of 65 %. Not bad for a single neuron or simple logistic regression. It's a bit long post but understanding the basics is the key to understand more complex algorithms. Sigmoid function(or similar functions) is the building block for Neural Networks, Deep learning and AI. I hope this article gave a good intuition about the sigmoid function and neural network approach.Building on top of this article, in the next post, I will talk about how to train a multi layer neural network. Before wrapping up, I will try to show what the neuron has learned at the end of training. Now this part is not for weak hearted people. Continue only if you are brave and curious :D.Let's use the final weights to multiply corresponding pixels in training data and scale by a factor 255, since we divided pixels by 255 for standardization.Now let's plot an image from the reconstructed data.test=d["w"].T*train_x*255test=test.T.reshape(80,64,64,3)plt.rcParams['figure.figsize'] = (10.0, 10.0)plt.imshow(test[0], interpolation='sinc')Output:You may either find the image artistic or scary or weird. Neverthless it's still very interesting, atleast for me ;). For plotting the above image I used sinc interpolation. We can try different interpolations and see the effects. methods = [None, 'none', 'nearest', 'bilinear', 'bicubic', 'spline16', 'spline36', 'hanning', 'hamming', 'hermite', 'kaiser', 'quadric', 'catrom', 'gaussian', 'bessel', 'mitchell', 'sinc', 'lanczos']# Fixing random state for reproducibility np.random.seed(19680801) fig, axes = plt.subplots(3, 6, figsize=(24, 12), subplot_kw={'xticks': [], 'yticks': []})fig.subplots_adjust(hspace=0.3, wspace=0.05)for ax, interp_method in zip(axes.flat, methods): plt.rc('font', size=15) ax.imshow(test[0], interpolation=interp_method, cmap=None) ax.set_title(interp_method)plt.show()Output: Let's create a montage and compare the reconstructed images vs original.def montage(images, saveto='montage.png'): """Draw all images as a montage separated by 1 pixel borders. Also saves the file to the destination specified by `saveto`. Parameters ---------- images : numpy.ndarray Input array to create montage of. Array should be: batch x height x width x channels. saveto : str Location to save the resulting montage image. Returns ------- m : numpy.ndarray Montage image. """ if isinstance(images, list): images = np.array(images) img_h = images.shape[1] img_w = images.shape[2] n_plots = int(np.ceil(np.sqrt(images.shape[0]))) if len(images.shape) == 4 and images.shape[3] == 3: m = np.ones( (images.shape[1] * n_plots + n_plots + 1, images.shape[2] * n_plots + n_plots + 1, 3)) * 0.5 else: m = np.ones( (images.shape[1] * n_plots + n_plots + 1, images.shape[2] * n_plots + n_plots + 1)) * 0.5 for i in range(n_plots): for j in range(n_plots): this_filter = i * n_plots + j if this_filter < images.shape[0]: this_img = images[this_filter] m[1 + i + i * img_h:1 + i + (i + 1) * img_h, 1 + j + j * img_w:1 + j + (j + 1) * img_w] = this_img #plt.imsave(arr=m, fname=saveto) return m The above function can be used to create montages. Now let's combine some of the reconstructed images and original data and create a montage.compare = np.concatenate((test[52:54], data[52:54]), axis=0) compare.shape Output:(4, 64, 64, 3) Now let us try to create the montage with two different interpolations.plt.imshow(montage(compare,saveto='montage.png'),interpolation='spline36')plt.show()plt.imshow(montage(compare,saveto='montage.png'),interpolation='bicubic')plt.show()Output:If you look carefully, in the reconstructed image, hair colors of the image have been captured differently. This is an indication that the algorithm has learned some of the facial features from the data.Also, other thing we can do is to generate the montage with different interpolations for comparison.methods = [None, 'none', 'nearest', 'bilinear', 'bicubic', 'spline16', 'spline36', 'hanning', 'hamming', 'hermite', 'kaiser', 'quadric', 'catrom', 'gaussian', 'bessel', 'mitchell', 'sinc', 'lanczos']# Fixing random state for reproducibility np.random.seed(19680801) fig, axes = plt.subplots(3, 6, figsize=(24, 12), subplot_kw={'xticks': [], 'yticks': []})fig.subplots_adjust(hspace=0.3, wspace=0.05)for ax, interp_method in zip(axes.flat, methods): ax.imshow(montage(compare,saveto='montage.png'), interpolation=interp_method, cmap=None) ax.set_title(interp_method)plt.show() Output: Images are very interesting. We can find very interesting patterns and visulaize how an algorithm learns to identify patterns in the image. It always amazes me. On that note I am putting my pen down on this article. In the next article, I will talk about multi layer neural networks and try to explore what the neurons have learned from the images. References: 'Neural Networks and Deep Learning' on Coursera by Andrew Ng Calculus by Gilbert StrangSee More

]]>

]]>

In this series, I will talk about training a simple neural network on image data. To give a brief overview, neural networks is a kind of supervised learning. By this I mean, the model needs to train on historical data to understand the relationship between input variables and target variables. Once trained, the model can be used to predict target variable on new input data. In the previous posts, we have written about linear, lasso and ridge regression. All those methods come under supervised learning. But what is special about neural networks is, it works really well for image, audio, video and language datasets. A multilayer neural network and its variations are commonly called deep learning.In this blog, I will focus on handling and processing the image data. In the next blog, I will show how to train the model.I will use python for implementation as python as many useful functions for image processing. If you are new to python, I recommend you to quickly take a numpy (till array manipulation) and matplotlib tutorial.Main contents of this article:a) Exploring image dataset: Reading images, printing image arrays and actual images, decomposing images into different color channelsb) Cropping and Resizing images: Cropping rectangle images to a square, resizing high resolution images to a lower resolution for easier processing, creating gray scale images from color images and standardizing image datac) Colormapping and Interpolation: Converting no color channel images to color images using different themes. Interpolating after resizing or reducing resolution of images to for retaining quality and information.d) Montage Creation and Preparing image data for modeling I originally used Jupyter notebook for the article. I am not sure how to upload it here. Hence each cell is shown between <Python code start> and <Python code end>. Apologies if the format is looking a little weird.Okay! Let's get started. First let's get some data set. This data is shared in the course on kadanze about Creative Applications of Deeplearning by Parag Mital. This data contains pictures of celebrities. Original source can be found here along with some description here. In the original dataset, there are around 200,000 pictures. For the purpose of this blog we will use 100 images. Following code will download 100 images.<Python code start> # Load the os library import os# Load the request module import urllib.requestif not os.path.exists('img_align_celeba'):# Create a directory os.mkdir('img_align_celeba')# Now perform the following 100 times: for img_i in range(1, 101):# create a string using the current loop counter f = '000%03d.jpg' % img_i# and get the url with that string appended the end url = 'https://s3.amazonaws.com/cadl/celeb-align/' + f# We'll print this out to the console so we can see how far we've gone print(url, end='\r')# And now download the url to a location inside our new directory urllib.request.urlretrieve(url, os.path.join('img_align_celeba', f)) else: print('Celeb Net dataset already downloaded')<Python code end>If the dataset was not downloaded, the above code will download it. Let's read the downloaded images. <Python code start> files = [os.path.join('img_align_celeba', file_i) for file_i in os.listdir('img_align_celeba') if '.jpg' in file_i]<Python code end>Let's add a target column. Here we will try to identify whether a picture shows a male celebrity or female celebrity. Value 1 denotes 'Female celebrity' and 0 denotes 'male celebrity'. <Python code start> y=np.array([1,1,0,1,1,1,0,0,1,1,1,0,0,1,0,0,1,1,1,0,0,1,0,1,0,1,1,1,1,0,1,0,0,1,1,0,0,0,1,1,0,1,1,1,1,1,1,0,0,0,0,0,0,1,0,0,1,1,1,0,0,1,1,0,0,1,0,0,0,0,1,0,1,1,1,0,1,1,0,0,0,0,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1]) y=y.reshape(1,y.shape[0]) classes=np.array(['Male', 'Female']) y_train=y[:,:80] y_test=y[:,80:]<Python code end>Now let's take a closer look at our data set. For this we will use the matplotlib library for plotting. We can also use it to view the images in our data <Python code start> import matplotlib.pyplot as plt %matplotlib inline<Python code end>Exploring image dataset In this section we will try to understand our image data better. Let's plot an image from the data set(in this case the first image) <Python code start> plt.imread(files[0])<Python code start> Output: array([[[253, 231, 194], [253, 231, 194], [253, 231, 194], ..., [247, 226, 225], [254, 238, 222], [254, 238, 222]],[[253, 231, 194], [253, 231, 194], [253, 231, 194], ..., [249, 228, 225], [254, 238, 222], [254, 238, 222]],[[253, 231, 194], [253, 231, 194], [253, 231, 194], ..., [250, 231, 227], [255, 239, 223], [255, 239, 223]],..., [[140, 74, 26], [116, 48, 1], [146, 78, 33], ..., [122, 55, 28], [122, 56, 30], [122, 56, 30]],[[130, 62, 15], [138, 70, 23], [166, 98, 53], ..., [118, 49, 20], [118, 51, 24], [118, 51, 24]],[[168, 100, 53], [204, 136, 89], [245, 177, 132], ..., [118, 49, 20], [120, 50, 24], [120, 50, 24]]], dtype=uint8)It prints some numbers. Here each tuple (the innermost array) represents a pixel. As you can see, it has 3 values, one corresponding to each color channel, RGB (Red, Green Blue). To view the data as image, we have to use 'imshow' function <Python code start> img = plt.imread(files[0]) plt.imshow(img)<Python code end> Output: Let's see the shape(dimensions) of the image <Python code start> img.shape<Python code end> Output: (218, 178, 3)This means height of the image is 218 pixels,width 178 pixels and each pixel has 3 color channels(RGB). We can view the image using each of the color channels <Python code start> plt.figure() plt.imshow(img[:, :, 0]) plt.figure() plt.imshow(img[:, :, 1]) plt.figure() plt.imshow(img[:, :, 2])<Python code end> Output: Cropping and resizing images: For many of the deeplearning and image processing applications, we will need to crop the image to a square and resize it for faster processing. The following function will crop any rectangle image(height != width) to a square image <Python code start> def imcrop_tosquare(img): if img.shape[0] > img.shape[1]: extra = (img.shape[0] - img.shape[1]) // 2 crop = img[extra:-extra, :] elif img.shape[1] > img.shape[0]: extra = (img.shape[1] - img.shape[0]) // 2 crop = img[:, extra:-extra] else: crop = img return crop<Python code end>Now we will resize the image to 64 by 64 pixels(height=64,width=64). For resizing, we can use the imresize function from scipy <Python code start> from scipy.misc import imresize square = imcrop_tosquare(img) rsz = imresize(square, (64, 64)) plt.imshow(rsz) print(rsz.shape)<Python code end>Output:(64, 64, 3)As we can see from the shape of the image, it has been resized to (64,64,3). If we take the mean of each color channels(RGB), we will get a grayscale image. <Python code start> mean_img = np.mean(rsz, axis=2) print(mean_img.shape) plt.imshow(mean_img, cmap='gray')<Python code end>Output:(64, 64)Colormapping and interpolating imagesWhen there is no color channel for an image, you can use different availble color maps provided by matplotlib. Following code iterates through different color maps for the above image(with no color channel) and plots it. For your purposes, you can choose the best one if you come across such images. It is also an easy way to convert grayscale images to color.<Python code start>methods = [None,'viridis', 'plasma', 'inferno', 'magma','Greys', 'Purples', 'Blues', 'Greens', 'Oranges', 'Reds', 'YlOrBr', 'YlOrRd', 'OrRd', 'PuRd', 'RdPu', 'BuPu','GnBu', 'PuBu', 'YlGnBu', 'PuBuGn', 'BuGn', 'YlGn', 'binary', 'gist_yarg', 'gist_gray', 'gray', 'bone', 'pink', 'spring', 'summer', 'autumn', 'winter', 'cool', 'Wistia', 'hot', 'afmhot', 'gist_heat', 'copper','PiYG', 'PRGn', 'BrBG', 'PuOr', 'RdGy', 'RdBu','RdYlBu', 'RdYlGn', 'Spectral', 'coolwarm', 'bwr', 'seismic','Pastel1', 'Pastel2', 'Paired', 'Accent','Dark2', 'Set1', 'Set2', 'Set3', 'tab10', 'tab20', 'tab20b', 'tab20c','flag', 'prism', 'ocean', 'gist_earth', 'terrain', 'gist_stern','gnuplot', 'gnuplot2', 'CMRmap', 'cubehelix', 'brg', 'hsv','gist_rainbow', 'rainbow', 'jet', 'nipy_spectral', 'gist_ncar']# Fixing random state for reproducibility np.random.seed(19680801) fig, axes = plt.subplots(10, 8, figsize=(16, 32), subplot_kw={'xticks': [], 'yticks': []})fig.subplots_adjust(hspace=0.3, wspace=0.05)for ax, interp_method in zip(axes.flat, methods): ax.imshow(mean_img, interpolation='sinc',cmap=interp_method) ax.set_title(interp_method)<Python code end>Output: Now let's crop all the rectangle images as square and resize all the images in the dataset to size 64, 64, 3. <Python code start> imgs = [] for file_i in files: img = plt.imread(file_i) square = imcrop_tosquare(img) rsz = imresize(square, (64, 64)) imgs.append(rsz) print(len(imgs))<Python code end>Output:100Let's combine all the images into a variable <Python code start> data = np.array(imgs)<Python code end>If you are familiar with machine learning, you will know the about data standardization. It means generally bringing down the range of an input variable. Same can be done with image data as well. However, for images there is an easy way to standardize. We can simply divide each of the values by 255, as each pixel can have values from 0-255. This will change the scale from 0-255 to 0-1. This will make sure while taking exponents in logistic regression, we won't overflow the system. <Python code start> data=data/255 plt.imshow(data[0])<Python code end> Output:When we reduce the resolution, we lose some information. We can use different kinds of interpolation to overcome this. The following code shows the effect of different kinds of interpolation. While plotting images after resizing, you can choose any interpolation you like.<Python code start>methods = [None, 'none', 'nearest', 'bilinear', 'bicubic', 'spline16', 'spline36', 'hanning', 'hamming', 'hermite', 'kaiser', 'quadric', 'catrom', 'gaussian', 'bessel', 'mitchell', 'sinc', 'lanczos']# Fixing random state for reproducibility np.random.seed(19680801) fig, axes = plt.subplots(3, 6, figsize=(24, 12), subplot_kw={'xticks': [], 'yticks': []})fig.subplots_adjust(hspace=0.3, wspace=0.05)for ax, interp_method in zip(axes.flat, methods): ax.imshow(data[0], interpolation=interp_method, cmap=None) ax.set_title(interp_method)plt.show()<Python code end>Output: <Python code start> data.shape<Python code end> Output: (100, 64, 64, 3)The shape of the data is 100,64,64,3. This means there are 100 images of size(64,64,3)Montage creation Till now we have been inspecting one image at a time. To view all images, we can use the following function to create a montage of all the images <Python code start> def montage(images, saveto='montage.png'): """Draw all images as a montage separated by 1 pixel borders. Also saves the file to the destination specified by `saveto`. Parameters ---------- images : numpy.ndarray Input array to create montage of. Array should be: batch x height x width x channels. saveto : str Location to save the resulting montage image. Returns ------- m : numpy.ndarray Montage image. """ if isinstance(images, list): images = np.array(images) img_h = images.shape[1] img_w = images.shape[2] n_plots = int(np.ceil(np.sqrt(images.shape[0]))) if len(images.shape) == 4 and images.shape[3] == 3: m = np.ones( (images.shape[1] * n_plots + n_plots + 1, images.shape[2] * n_plots + n_plots + 1, 3)) * 0.5 else: m = np.ones( (images.shape[1] * n_plots + n_plots + 1, images.shape[2] * n_plots + n_plots + 1)) * 0.5 for i in range(n_plots): for j in range(n_plots): this_filter = i * n_plots + j if this_filter < images.shape[0]: this_img = images[this_filter] m[1 + i + i * img_h:1 + i + (i + 1) * img_h, 1 + j + j * img_w:1 + j + (j + 1) * img_w] = this_img plt.imsave(arr=m, fname=saveto) return m plt.figure(figsize=(10, 10)) plt.imshow(montage(imgs,saveto='montage.png').astype(np.uint8))<Python code end> Output: Data Preparation for modeling Let's split the data into train and test. Train data will be used to train the model. Then we will predict on test data to check the accuracy of the trained model <Python code start> train_x_orig=data[:80,:,:,:] test_x_orig=data[80:,:,:,:]<Python code end>Unlike regression model, logistic regression is used to predict a binomial variable.i.e means a variable which takes only two values. This suits perfectly for us as we are trying to predict from the images whether it is a male or female celebrity. <Python code start> m_train = train_x_orig.shape[0] m_test = y_test.shape[1] num_px = train_x_orig.shape[1]print ("Number of training examples: m_train = " + str(m_train)) print ("Number of testing examples: m_test = " + str(m_test)) print ("Height/Width of each image: num_px = " + str(num_px)) print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)") print ("train_x shape: " + str(train_x_orig.shape)) print ("y_train shape: " + str(y_train.shape)) print ("test_x shape: " + str(test_x_orig.shape)) print ("y_test shape: " + str(y_test.shape))<Python code end>Output:Number of training examples: m_train = 80 Number of testing examples: m_test = 20 Height/Width of each image: num_px = 64 Each image is of size: (64, 64, 3) train_x shape: (80, 64, 64, 3) y_train shape: (1, 80) test_x shape: (20, 64, 64, 3) y_test shape: (1, 20)For the purpose of training, we have to reshape our data or flatten our data. After flattening, shape of our data should become (height * width * 3, number of examples). After flattening, every column will represent an image. <Python code start> train_x = train_x_orig.reshape(train_x_orig.shape[0],-1).T test_x = test_x_orig.reshape(test_x_orig.shape[0],-1).T print ("train_x flatten shape: " + str(train_x.shape)) print ("test_x flatten shape: " + str(test_x.shape))<Python code end>Output:train_x flatten shape: (12288, 80) test_x flatten shape: (12288, 20)Now we have our data set ready. In the next part I will talk about how to train the model using simple logistic regression using gradient descent. Meanwhile If you want to know more about gradient descent check it out hereTo read original article,click here.-Jobil Louis JosephReferences: 'Creative Applications of Deep Learning with Tensorflow' on Kadenze by Parag Mital 'Neural Networks and Deep Learning' on Coursera by Andrew Ng.See More

]]>

]]>