Code a Neural Network in plain NumPy: Part 1 (with no Hidden Layer)

You will learn to:

``````Build the general architecture of a learning algorithm, including:
Initializing parameters
Calculating the cost function and its gradient
Using an optimization algorithm (gradient descent)

Gather all three functions above into a main model function, in the right order.``````

Overview of the Problem set

Problem Statement: You are given a dataset containing:

``````- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- a test set of m_test images labeled as cat or non-cat
- each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB). Thus, each image is square (height = num_px) and (width = num_px).
``````

You will build a simple image-recognition algorithm that can correctly classify pictures as cat or non-cat.

Show me the code

```import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset # for this lib git link is provided
%matplotlib inline```

Let’s get more familiar with the dataset.In [183]:

```# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()```

In [28]:

`train_set_x_orig.shape`

Out[28]:

`(209, 64, 64, 3)`

In [29]:

`train_set_y.shape`

Out[29]:

`(1, 209)`

In [21]:

`classes`

Out[21]:

`array([b'non-cat', b'cat'], dtype='|S7')`

In [184]:

```# Example of a picture
index = 200
plt.imshow(train_set_x_orig[index])```

Out[184]:

`<matplotlib.image.AxesImage at 0x22c6a3ba240>`

Exercise: 1

Find the values for:

• m_train (number of training examples)
• m_test (number of test examples)
• num_px (= height = width of a training image) Remember that train_set_x_orig is a numpy-array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0].

In [186]:

```m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]

print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Height/Width of each image: num_px = " + str(num_px))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_set_x shape: " + str(train_set_x_orig.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x shape: " + str(test_set_x_orig.shape))
print ("test_set_y shape: " + str(test_set_y.shape))```
```Number of training examples: m_train = 209
Number of testing examples: m_test = 50
Height/Width of each image: num_px = 64
Each image is of size: (64, 64, 3)
train_set_x shape: (209, 64, 64, 3)
train_set_y shape: (1, 209)
test_set_x shape: (50, 64, 64, 3)
test_set_y shape: (1, 50)```

Exercise:2

Standardize and Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px ∗ num_px ∗ 3, 1).In [66]:

`train_set_x_flatten =train_set_x_orig.reshape(train_set_x_orig.shape[1]*train_set_x_orig.shape[1]*3,train_set_x_orig.shape[0])`

In [67]:

`train_set_x_flatten .shape`

Out[67]:

`(12288, 209)`

In [54]:

`test_set_x_flatten =test_set_x_orig.reshape(test_set_x_orig.shape[1]*test_set_x_orig.shape[1]*3,test_set_x_orig.shape[0])`

In [56]:

`test_set_x_flatten.shape`

Out[56]:

`(12288, 50)`

To represent color images, the red, green and blue channels (RGB) must be specified for each pixel, and so the pixel value is actually a vector of three numbers ranging from 0 to 255. Let’s standardize our dataset.In [69]:

```train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.```

General Architecture of the learning algorithm

• Initialize the parameters of the model
• Learn the parameters for the model by minimizing the cost
• Use the learned parameters to make predictions (on the test set)
• Analyse the results and conclude

Building the parts of our algorithm

The main steps for building a Neural Network are: Define the model structure (such as number of input features) Initialize the model’s parameters Loop: Calculate current loss (forward propagation) Calculate current gradient (backward propagation) Update parameters (gradient descent) You often build 1-3 separately and integrate them into one function we call model().

Exercise:3

Implement sigmoid() function .In [188]:

```def sigmoid(z):
return 1/(1+np.exp(-z))```

In [189]:

`print ("sigmoid([0, 2]) = " + str(sigmoid(np.array([0,2]))))`
`sigmoid([0, 2]) = [0.5        0.88079708]`

Exercise:4

Implement parameter initialization. You have to initialize w as a vector of zeros. If you don’t know what numpy function to use, look up np.zeros() in the Numpy library’s documentation.In [192]:

```def initialize_with_zeros(dim):
w=np.zeros((dim,1))
b=0
return w,b```

In [193]:

```dim = 5
w, b = initialize_with_zeros(dim)
print ("w = " + str(w))
print ("b = " + str(b))```
```w = [[0.]
[0.]
[0.]
[0.]
[0.]]
b = 0```

Exercise:5

Implement a function propagate() that computes the cost function and its gradient.In [194]:

```def propagate(w, b, X, Y):
m=X.shape[1]
A=sigmoid((np.dot(w.T,X) + b))
yloga=np.multiply(Y,np.log(A))
ylogaa=np.multiply((1-Y),np.log(1-A))
cost=(-1/m)*(np.sum(yloga+ylogaa))
dw=(1/m)*(np.dot(X,(A-Y).T))
db=(1/m)*(np.sum(A-Y))
assert(dw.shape == w.shape)
assert(db.dtype == float)
cost = np.squeeze(cost)
assert(cost.shape == ())
"db": db}

In [195]:

```w, b, X, Y = np.array([[1.],[2.]]), 2., np.array([[1.,2.,-1.],[3.,4.,-3.2]]), np.array([[1,0,1]])
grads, cost = propagate(w, b, X, Y)
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
print ("cost = " + str(cost))```
```dw = [[0.99845601]
[2.39507239]]
db = 0.001455578136784208
cost = 5.801545319394553```

Optimization

You have initialized your parameters. You are also able to compute a cost function and its gradient. Now, you want to update the parameters using gradient descent.

Exercise: 6

Write down the optimization function. The goal is to learn w and b by minimizing the cost function J . For a parameter θ ,the update rule is θ=θ−α*dθ ,where α is the learning rate.In [196]:

```def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
costs=[]
for i in range(num_iterations):

w=w-np.multiply(learning_rate,dw)
b=b-np.multiply(learning_rate,db)

costs.append(cost)

# Record the costs
if i % 100 == 0:
costs.append(cost)

# Print the cost every 100 training iterations
if print_cost and i % 100 == 0:
print ("Cost after iteration %i: %f" %(i, cost))

params = {"w": w,
"b": b}
"db": db}

In [197]:

```params, grads, costs = optimize(w, b, X, Y, num_iterations= 100, learning_rate = 0.009, print_cost = False)

print ("w = " + str(params["w"]))
print ("b = " + str(params["b"]))
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))```
```w = [[0.19033591]
[0.12259159]]
b = 1.9253598300845747
dw = [[0.67752042]
[1.41625495]]
db = 0.21919450454067652```

Exercise: 7

We are able to learned w and b for a dataset X. Now Implement the predict() function.In [147]:

```def predict(w, b, X):
m = X.shape[1]
Y_prediction = np.zeros((1,m))
A=sigmoid(np.dot(w.T,X)+b)
for i in range(A.shape[1]):
if A[0,i] > 0.5:
Y_prediction[0][i] = 1
else:
Y_prediction[0][i] = 0
assert(Y_prediction.shape == (1, m))
return Y_prediction ```

In [148]:

```w = np.array([[0.1124579],[0.23106775]])
b = -0.3
X = np.array([[1.,-1.1,-3.2],[1.2,2.,0.1]])
print ("predictions = " + str(predict(w, b, X)))```
`predictions = [[1. 1. 0.]]`

Merge all functions into a model

You will now see how the overall model is structured by putting together all the building blocks (functions implemented in the previous parts) together, in the right order.

Exercise: Implement the model function. Use the following notation:

• Y_prediction_test for your predictions on the test set
• Y_prediction_train for your predictions on the train set
• w, costs, grads for the outputs of optimize()

In [198]:

```def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
dim=X_train.shape[0]
w,b=initialize_with_zeros(dim)
params, grads, costs = optimize(w, b, X_train, Y_train, num_iterations= num_iterations, learning_rate= learning_rate, print_cost= True)

Y_prediction_train=predict(params["w"], params["b"], X_train)
Y_prediction_test=predict(params["w"],params["b"], X_test)

print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

d = {"costs": costs,
"Y_prediction_test": Y_prediction_test,
"Y_prediction_train" : Y_prediction_train,
"w" : params["w"],
"b" : params["b"],
"learning_rate" : learning_rate,
"num_iterations": num_iterations}

return d```

In [203]:

`d=model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)`
```Cost after iteration 0: 0.693147
Cost after iteration 100: 0.709726
Cost after iteration 200: 0.657712
Cost after iteration 300: 0.614611
Cost after iteration 400: 0.578001
Cost after iteration 500: 0.546372
Cost after iteration 6: 0.518331
Cost after iteration 700: 0.492852
Cost after iteration 800: 0.469259
Cost after iteration 900: 0.447139
Cost after iteration 1000: 0.426262
Cost after iteration 1100: 0.406617
Cost after iteration 1200: 0.388723
Cost after iteration 1300: 0.374678
Cost after iteration 1400: 0.365826
Cost after iteration 1500: 0.358532
Cost after iteration 1600: 0.351612
Cost after iteration 1700: 0.345012
Cost after iteration 1800: 0.338704
Cost after iteration 1900: 0.332664
train accuracy: 91.38755980861244 %
test accuracy: 34.0 %```

In [162]:

```# Plot learning curve (with costs)
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()```

In [165]:

`classes`

Out[165]:

`array([b'non-cat', b'cat'], dtype='|S7')`

In [175]:

```# Example of a picture that was wrongly classified.
index = 2
plt.imshow(test_set_x[:,index].reshape((num_px, num_px, 3)))
# print ("y = " + str(test_set_y[0,index]) + ", you predicted that it is a \"" + classes[d["Y_prediction_test"][0,index]].decode("utf-8") +  "\" picture.")```

Out[175]:

`<matplotlib.image.AxesImage at 0x22c6a2d2f60>`

In [ ]:

```learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
print ("learning rate is: " + str(i))
models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 1500, learning_rate = i, print_cost = False)
print ('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(models[str(i)]["learning_rate"]))

plt.ylabel('cost')
plt.xlabel('iterations (hundreds)')

frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()
```
```learning rate is: 0.01
Cost after iteration 0: 0.693147
Cost after iteration 100: 2.321788
Cost after iteration 200: 3.011239
Cost after iteration 300: 0.483519
Cost after iteration 400: 1.297533
Cost after iteration 500: 1.215430
Cost after iteration 600: 1.135770
Cost after iteration 700: 0.901737
Cost after iteration 800: 0.821976
Cost after iteration 900: 0.791033
Cost after iteration 1000: 0.762400
Cost after iteration 1100: 0.736228
Cost after iteration 1200: 0.711983
Cost after iteration 1300: 0.689076
Cost after iteration 1400: 0.667013
train accuracy: 71.29186602870814 %
test accuracy: 64.0 %

-------------------------------------------------------

learning rate is: 0.001
Cost after iteration 0: 0.693147
Cost after iteration 100: 0.605784
Cost after iteration 200: 0.589938
Cost after iteration 300: 0.577890
Cost after iteration 400: 0.567791
Cost after iteration 500: 0.559013
Cost after iteration 600: 0.551207
Cost after iteration 700: 0.544146
Cost after iteration 800: 0.537671
Cost after iteration 900: 0.531668
Cost after iteration 1000: 0.526054
Cost after iteration 1100: 0.520764
Cost after iteration 1200: 0.515752
Cost after iteration 1300: 0.510979
Cost after iteration 1400: 0.506416
train accuracy: 74.16267942583733 %
test accuracy: 34.0 %

-------------------------------------------------------

learning rate is: 0.0001
Cost after iteration 0: 0.693147
Cost after iteration 100: 0.636292
Cost after iteration 200: 0.630322
Cost after iteration 300: 0.625487
Cost after iteration 400: 0.621470
Cost after iteration 500: 0.618051
Cost after iteration 600: 0.615075
Cost after iteration 700: 0.612432
Cost after iteration 800: 0.610042
Cost after iteration 900: 0.607850
Cost after iteration 1000: 0.605814
Cost after iteration 1100: 0.603904
Cost after iteration 1200: 0.602098
Cost after iteration 1300: 0.600377
Cost after iteration 1400: 0.598731```
`lr_utils.py code`
``````import numpy as np
import h5py

train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")
train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels
test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels

classes = np.array(test_dataset["list_classes"][:]) # the list of classes

train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes``````

REFERENCES:

coursera.org
deep learning specialization