# CODE A NEURAL NETWORK IN PLAIN NUMPY Part 2: Planar data classification with one hidden layer

In the last post we saw a neural network with only two layers, an input layer and an output layer, which is essentially a logistic regression algorithm. In this post we will code a neural network with one more layer: a hidden layer.

You will learn how to:

• Implement a 2-class classification neural network with a single hidden layer
• Use units with a non-linear activation function, such as tanh
• Compute the cross entropy loss
• Implement forward and backward propagation
```
# Package imports
import numpy as np
import matplotlib.pyplot as plt
from testCases_v2 import *
import sklearn
import sklearn.datasets
import sklearn.linear_model
from planar_utils import plot_decision_boundary, sigmoid, load_planar_dataset  # helpers reproduced at the end of this post

%matplotlib inline
np.random.seed(1) # set a seed so that the results are consistent
```

## Dataset

First, let’s get the dataset you will work on. The following code will load a “flower” 2-class dataset into the variables X and Y.

In :

`X, Y = load_planar_dataset()`

Visualize the dataset using matplotlib. The data looks like a “flower” with some red (label y=0) and some blue (y=1) points. Your goal is to build a model that fits this data. In other words, we want the classifier to define regions as either red or blue.

In :

`plt.scatter(X[0, :], X[1, :], c=Y, s=40, cmap=plt.cm.Spectral)`

Out:

`<matplotlib.collections.PathCollection at 0x27c7e1ee7f0>`

You have:

• a numpy-array (matrix) X that contains your features (x1, x2)
• a numpy-array (vector) Y that contains your labels (red:0, blue:1).

Let’s first get a better sense of what our data looks like.

## Exercise:

How many training examples do you have? In addition, what are the shapes of the variables X and Y?

In :

`X.shape,Y.shape`

Out:

`((2, 400), (1, 400))`

In :

`X.T.shape,Y.T.shape`

Out:

`((400, 2), (400, 1))`

## Simple Logistic Regression

Before building a full neural network, let’s first see how logistic regression performs on this problem. You can use sklearn’s built-in functions to do that. Run the code below to train a logistic regression classifier on the dataset.

In :

`clf = sklearn.linear_model.LogisticRegressionCV()`

In :

`clf.fit(X.T, Y.T)`
```
anaconda3\envs\tf\lib\site-packages\sklearn\utils\validation.py:761: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
anaconda3\envs\tf\lib\site-packages\sklearn\model_selection\_split.py:2053: FutureWarning: You should specify a value for 'cv' instead of relying on the default value. The default value will change from 3 to 5 in version 0.22.
  warnings.warn(CV_WARNING, FutureWarning)
```

Out:

```
LogisticRegressionCV(Cs=10, class_weight=None, cv='warn', dual=False,
                     fit_intercept=True, intercept_scaling=1.0, max_iter=100,
                     multi_class='warn', n_jobs=None, penalty='l2',
                     random_state=None, refit=True, scoring=None, solver='lbfgs',
                     tol=0.0001, verbose=0)
```

You can now plot the decision boundary of this model. Run the code below.

In :

```
# Plot the decision boundary for logistic regression
plot_decision_boundary(lambda x: clf.predict(x), X, Y)
plt.title("Logistic Regression")

# Print accuracy
LR_predictions = clf.predict(X.T)
print('Accuracy of logistic regression: %d ' % float((np.dot(Y, LR_predictions) + np.dot(1 - Y, 1 - LR_predictions)) / float(Y.size) * 100) +
      '% ' + "(percentage of correctly labelled datapoints)")
```
`Accuracy of logistic regression: 47 % (percentage of correctly labelled datapoints)`

Interpretation: The dataset is not linearly separable, so logistic regression doesn’t perform well. Hopefully a neural network will do better. Let’s try this now!

## The general methodology to build a Neural Network is to:

1. Define the neural network structure ( # of input units, # of hidden units, etc).

2. Initialize the model’s parameters

3. Loop:

• Implement forward propagation
• Compute loss
• Implement backward propagation to get the gradients
• Update parameters (gradient descent)
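Concretely, for one training example x⁽ⁱ⁾, the forward pass of the model described above computes (using the notation of the Coursera assignment this post follows):

```latex
z^{[1](i)} = W^{[1]} x^{(i)} + b^{[1]}, \qquad a^{[1](i)} = \tanh\left(z^{[1](i)}\right)
z^{[2](i)} = W^{[2]} a^{[1](i)} + b^{[2]}, \qquad \hat{y}^{(i)} = a^{[2](i)} = \sigma\left(z^{[2](i)}\right)
```

Given the predictions over all m examples, the cross-entropy cost to minimize is:

```latex
J = -\frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} \log a^{[2](i)} + \left(1 - y^{(i)}\right) \log\left(1 - a^{[2](i)}\right) \right)
```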

## Exercise:

Define three variables:

• n_x: the size of the input layer
• n_h: the size of the hidden layer (set this to 4)
• n_y: the size of the output layer

In :

```
def layer_sizes(X, Y):
    n_x = X.shape[0]  # size of the input layer
    n_h = 4           # size of the hidden layer (hard-coded to 4)
    n_y = Y.shape[0]  # size of the output layer
    return (n_x, n_h, n_y)
```

In :

```
X_assess, Y_assess = layer_sizes_test_case()
(n_x, n_h, n_y) = layer_sizes(X_assess, Y_assess)
print("The size of the input layer is: n_x = " + str(n_x))
print("The size of the hidden layer is: n_h = " + str(n_h))
print("The size of the output layer is: n_y = " + str(n_y))
```
```
The size of the input layer is: n_x = 5
The size of the hidden layer is: n_h = 4
The size of the output layer is: n_y = 2
```

## Exercise:

Implement the function initialize_parameters().

Instructions:

• Make sure your parameters’ sizes are right. Refer to the network architecture described above if needed.
• You will initialize the weight matrices with random values. Use np.random.randn(a,b) * 0.01 to randomly initialize a matrix of shape (a,b).
• You will initialize the bias vectors as zeros. Use np.zeros((a,b)) to initialize a matrix of shape (a,b) with zeros.

In :

```
def initialize_parameters(n_x, n_h, n_y):
    np.random.seed(2)
    # weights: small random values; biases: zeros
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))

    assert (W1.shape == (n_h, n_x))
    assert (b1.shape == (n_h, 1))
    assert (W2.shape == (n_y, n_h))
    assert (b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters
```

In :

```
n_x, n_h, n_y = initialize_parameters_test_case()

parameters = initialize_parameters(n_x, n_h, n_y)
print("W1 = " + str(parameters["W1"]))
print("b1 = " + str(parameters["b1"]))
print("W2 = " + str(parameters["W2"]))
print("b2 = " + str(parameters["b2"]))
```
```
W1 = [[-0.00416758 -0.00056267]
[-0.02136196  0.01640271]
[-0.01793436 -0.00841747]
[ 0.00502881 -0.01245288]]
b1 = [[0.]
[0.]
[0.]
[0.]]
W2 = [[-0.01057952 -0.00909008  0.00551454  0.02292208]]
b2 = [[0.]]
```
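A side note on the * 0.01 scaling (my own illustration, not part of the original assignment): with tanh activations, large pre-activations saturate the unit, and the tanh derivative 1 - tanh(z)² that appears in backpropagation collapses toward zero, so learning stalls. A quick comparison:

```python
import numpy as np

np.random.seed(0)
z_small = np.random.randn(1000) * 0.01  # pre-activations under the scaled init
z_large = np.random.randn(1000) * 10    # pre-activations without the scaling

# tanh'(z) = 1 - tanh(z)^2: the factor that multiplies gradients in backprop
grad_small = 1 - np.tanh(z_small) ** 2
grad_large = 1 - np.tanh(z_large) ** 2

print(grad_small.mean())  # close to 1: gradients flow
print(grad_large.mean())  # close to 0: saturated units, learning stalls
```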

## Exercise:

Implement forward_propagation().

In :

```
def forward_propagation(X, parameters):
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    # linear step -> tanh, then linear step -> sigmoid
    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)

    assert(A2.shape == (1, X.shape[1]))

    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}

    return A2, cache
```
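Note that sigmoid is not a NumPy built-in; in this post it comes from the planar_utils helper file reproduced at the end. If you don’t have that file handy, a minimal standalone definition is:

```python
import numpy as np

def sigmoid(x):
    """Elementwise logistic sigmoid: 1 / (1 + e^-x)."""
    return 1 / (1 + np.exp(-x))

print(sigmoid(0))  # 0.5
```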

In :

```
X_assess, parameters = forward_propagation_test_case()

A2, cache = forward_propagation(X_assess, parameters)

# Note: we use the mean here just to make sure that your output matches ours.
print(np.mean(cache['Z1']), np.mean(cache['A1']), np.mean(cache['Z2']), np.mean(cache['A2']))
```
```
0.26281864019752443 0.09199904522700109 -1.3076660128732143 0.21287768171914198
```

## Exercise:

Implement compute_cost() to compute the value of the cost J.

In :

```
def compute_cost(A2, Y, parameters):
    m = Y.shape[1]  # number of examples
    # cross-entropy cost
    cost = (-1 / m) * np.sum(np.multiply(Y, np.log(A2)) + np.multiply(1 - Y, np.log(1 - A2)))
    cost = np.squeeze(cost)     # makes sure cost is the dimension we expect
    assert(isinstance(cost, float))
    return cost
```

In :

```
A2, Y_assess, parameters = compute_cost_test_case()
print("cost = " + str(compute_cost(A2, Y_assess, parameters)))
```
```
cost = 0.6930587610394646
```
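A quick sanity check on that number (my own illustration): a classifier that outputs A2 = 0.5 for every example pays -log(0.5) = log 2 per example, so its cost is log 2 ≈ 0.693 regardless of the labels, which is exactly the neighborhood of the near-chance cost printed above:

```python
import numpy as np

Y = np.array([[1, 0, 1]])   # arbitrary labels
A2 = np.full(Y.shape, 0.5)  # uninformative predictions
m = Y.shape[1]

cost = (-1 / m) * np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))
print(cost)  # 0.6931471805599453 = log(2)
```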

## Exercise:

Implement the function backward_propagation()

In :

```
def backward_propagation(parameters, cache, X, Y):
    m = Y.shape[1]
    A1 = cache["A1"]
    A2 = cache["A2"]

    # output layer gradients
    dZ2 = A2 - Y
    dW2 = (1 / m) * np.dot(dZ2, A1.T)
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)

    # hidden layer gradients; 1 - A1**2 is tanh'(Z1)
    dZ1 = np.multiply(np.dot(parameters["W2"].T, dZ2), 1 - np.power(A1, 2))
    dW1 = (1 / m) * np.dot(dZ1, X.T)
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)

    grads = {"dW1": dW1,
             "db1": db1,
             "dW2": dW2,
             "db2": db2}

    return grads
```

In :

```
parameters, cache, X_assess, Y_assess = backward_propagation_test_case()

grads = backward_propagation(parameters, cache, X_assess, Y_assess)
print("dW1 = " + str(grads["dW1"]))
print("db1 = " + str(grads["db1"]))
print("dW2 = " + str(grads["dW2"]))
print("db2 = " + str(grads["db2"]))
```
```
dW1 = [[ 0.00301023 -0.00747267]
[ 0.00257968 -0.00641288]
[-0.00156892  0.003893  ]
[-0.00652037  0.01618243]]
db1 = [[ 0.00176201]
[ 0.00150995]
[-0.00091736]
[-0.00381422]]
dW2 = [[ 0.00078841  0.01765429 -0.00084166 -0.01022527]]
db2 = [[-0.16655712]]
```
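The 1 - np.power(A1, 2) factor in dZ1 above is the derivative of the hidden layer’s tanh activation, since tanh'(z) = 1 - tanh(z)². A quick numerical check of that identity (my own illustration):

```python
import numpy as np

np.random.seed(1)
z = np.random.randn(5)
eps = 1e-6

# central finite difference vs. the closed-form derivative
numeric = (np.tanh(z + eps) - np.tanh(z - eps)) / (2 * eps)
analytic = 1 - np.tanh(z) ** 2

print(np.max(np.abs(numeric - analytic)))  # tiny: the two agree
```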

## Exercise:

Implement the update rule.
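The update itself is plain gradient descent: each parameter steps against its gradient, scaled by the learning rate α:

```latex
\theta \leftarrow \theta - \alpha \, \frac{\partial J}{\partial \theta}, \qquad \theta \in \left\{ W^{[1]}, b^{[1]}, W^{[2]}, b^{[2]} \right\}
```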
You have to use (dW1, db1, dW2, db2) in order to update (W1, b1, W2, b2).

In :

```
def update_parameters(parameters, grads, learning_rate=1.2):
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    dW1 = grads["dW1"]
    db1 = grads["db1"]
    dW2 = grads["dW2"]
    db2 = grads["db2"]

    # gradient descent step for each parameter
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1
    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters
```

In :

```
parameters, grads = update_parameters_test_case()
parameters = update_parameters(parameters, grads)

print("W1 = " + str(parameters["W1"]))
print("b1 = " + str(parameters["b1"]))
print("W2 = " + str(parameters["W2"]))
print("b2 = " + str(parameters["b2"]))
```
```
W1 = [[-0.00643025  0.01936718]
[-0.02410458  0.03978052]
[-0.01653973 -0.02096177]
[ 0.01046864 -0.05990141]]
b1 = [[-1.02420756e-06]
[ 1.27373948e-05]
[ 8.32996807e-07]
[-3.20136836e-06]]
W2 = [[-0.01041081 -0.04463285  0.01758031  0.04747113]]
b2 = [[0.00010457]]
```

## Exercise:

Build your neural network model in nn_model().

In :

```
def nn_model(X, Y, n_h, num_iterations=10000, print_cost=True):
    np.random.seed(3)
    n_x = layer_sizes(X, Y)[0]
    n_y = layer_sizes(X, Y)[2]

    parameters = initialize_parameters(n_x, n_h, n_y)

    for i in range(num_iterations):
        A2, cache = forward_propagation(X, parameters)
        cost = compute_cost(A2, Y, parameters)
        grads = backward_propagation(parameters, cache, X, Y)
        parameters = update_parameters(parameters, grads)

        # Print the cost every 1000 iterations
        if print_cost and i % 1000 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    return parameters
```

In :

```
X_assess, Y_assess = nn_model_test_case()

parameters = nn_model(X_assess, Y_assess, 4, num_iterations=10000, print_cost=True)
print("W1 = " + str(parameters["W1"]))
print("b1 = " + str(parameters["b1"]))
print("W2 = " + str(parameters["W2"]))
print("b2 = " + str(parameters["b2"]))
```
```
Cost after iteration 0: 0.692739
Cost after iteration 1000: 0.000218
Cost after iteration 2000: 0.000107
Cost after iteration 3000: 0.000071
Cost after iteration 4000: 0.000053
Cost after iteration 5000: 0.000042
Cost after iteration 6000: 0.000035
Cost after iteration 7000: 0.000030
Cost after iteration 8000: 0.000026
Cost after iteration 9000: 0.000023
W1 = [[-0.65848169  1.21866811]
[-0.76204273  1.39377573]
[ 0.5792005  -1.10397703]
[ 0.76773391 -1.41477129]]
b1 = [[ 0.287592  ]
[ 0.3511264 ]
[-0.2431246 ]
[-0.35772805]]
W2 = [[-2.45566237 -3.27042274  2.00784958  3.36773273]]
b2 = [[0.20459656]]
```

## Exercise:

Use your model to predict by building predict(). Use forward propagation to predict results.

In :

```
def predict(parameters, X):
    # threshold the sigmoid output at 0.5
    A2, cache = forward_propagation(X, parameters)
    predictions = np.round(A2)
    return predictions
```

In :

```
parameters, X_assess = predict_test_case()

predictions = predict(parameters, X_assess)
print("predictions mean = " + str(np.mean(predictions)))
```
```
predictions mean = 0.6666666666666666
```

It is time to run the model and see how it performs on a planar dataset. Run the following code to test your model with a single hidden layer.

In :

```
# Build a model with an n_h-dimensional hidden layer
parameters = nn_model(X, Y, n_h = 4, num_iterations = 10000, print_cost=True)

# Plot the decision boundary
plot_decision_boundary(lambda x: predict(parameters, x.T), X, Y)
plt.title("Decision Boundary for hidden layer size " + str(4))
```
```
Cost after iteration 0: 0.693048
Cost after iteration 1000: 0.288083
Cost after iteration 2000: 0.254385
Cost after iteration 3000: 0.233864
Cost after iteration 4000: 0.226792
Cost after iteration 5000: 0.222644
Cost after iteration 6000: 0.219731
Cost after iteration 7000: 0.217504
Cost after iteration 8000: 0.219504
Cost after iteration 9000: 0.218571
```

Out:

`Text(0.5, 1.0, 'Decision Boundary for hidden layer size 4')`

In :

```
# Print accuracy
predictions = predict(parameters, X)
print ('Accuracy: %d' % float((np.dot(Y,predictions.T) + np.dot(1-Y,1-predictions.T))/float(Y.size)*100) + '%')
```
```
Accuracy: 90%
```

## References:

Deep Learning Specialization, Coursera: https://www.coursera.org/

### planar_utils.py file code below:

In :

```
import matplotlib.pyplot as plt
import numpy as np
import sklearn
import sklearn.datasets
import sklearn.linear_model

def plot_decision_boundary(model, X, y):
    # Set min and max values and give it some padding
    x_min, x_max = X[0, :].min() - 1, X[0, :].max() + 1
    y_min, y_max = X[1, :].min() - 1, X[1, :].max() + 1
    h = 0.01
    # Generate a grid of points with distance h between them
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict the function value for the whole grid
    Z = model(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    # Plot the contour and training examples
    plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral)
    plt.ylabel('x2')
    plt.xlabel('x1')
    plt.scatter(X[0, :], X[1, :], c=y, cmap=plt.cm.Spectral)

def sigmoid(x):
    """
    Compute the sigmoid of x
    Arguments:
    x -- A scalar or numpy array of any size.
    Return:
    s -- sigmoid(x)
    """
    s = 1/(1+np.exp(-x))
    return s

def load_planar_dataset():
    np.random.seed(1)
    m = 400 # number of examples
    N = int(m/2) # number of points per class
    D = 2 # dimensionality
    X = np.zeros((m,D)) # data matrix where each row is a single example
    Y = np.zeros((m,1), dtype='uint8') # labels vector (0 for red, 1 for blue)
    a = 4 # maximum ray of the flower

    for j in range(2):
        ix = range(N*j,N*(j+1))
        t = np.linspace(j*3.12,(j+1)*3.12,N) + np.random.randn(N)*0.2 # theta
        r = a*np.sin(4*t) + np.random.randn(N)*0.2 # radius
        X[ix] = np.c_[r*np.sin(t), r*np.cos(t)]
        Y[ix] = j

    X = X.T
    Y = Y.T

    return X, Y
```