# CODE A NEURAL NETWORK IN PLAIN NUMPY Part 2: Planar data classification with one hidden layer

In the last post we built a neural network with only two layers, an “input layer” and an “output layer”, which behaves like a logistic regression algorithm. In this post we are going to code a neural network with one more layer: a “hidden layer”.

You will learn how to:

• Implement a 2-class classification neural network with a single hidden layer
• Use units with a non-linear activation function, such as tanh
• Compute the cross entropy loss
• Implement forward and backward propagation
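
```
# Package imports
import numpy as np
import matplotlib.pyplot as plt
from testCases_v2 import *
import sklearn
import sklearn.datasets
import sklearn.linear_model
# Helper functions from planar_utils.py (listed at the end of this post)
from planar_utils import plot_decision_boundary, sigmoid, load_planar_dataset

# Jupyter magic: remove this line when running as a plain Python script
%matplotlib inline
np.random.seed(1) # set a seed so that the results are consistent
```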

## Dataset

First, let’s get the dataset you will work on. The following code will load a “flower” 2-class dataset into variables X and Y.

`X, Y = load_planar_dataset()`

Visualize the dataset using matplotlib. The data looks like a “flower” with some red (label y=0) and some blue (y=1) points. Your goal is to build a model to fit this data. In other words, we want the classifier to define regions as either red or blue. Y has shape (1, 400), so it is flattened before being passed as the color argument; recent matplotlib versions reject a 2-D color array here.

`plt.scatter(X[0, :], X[1, :], c=Y.ravel(), s=40, cmap=plt.cm.Spectral)`

Out:

`<matplotlib.collections.PathCollection at 0x27c7e1ee7f0>`

You have:

• a numpy-array (matrix) X that contains your features (x1, x2)
• a numpy-array (vector) Y that contains your labels (red:0, blue:1).

Let’s first get a better sense of what our data is like.

## Exercise:

How many training examples do you have? In addition, what is the shape of the variables X and Y?

`X.shape,Y.shape`

Out:

`((2, 400), (1, 400))`

`X.T.shape,Y.T.shape`

Out:

`((400, 2), (400, 1))`
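
The shapes above answer the exercise: each column of X is one example, so the number of training examples is the second dimension. A minimal sketch of that reading, using stand-in random arrays with the same shapes rather than the real flower data (`X_demo` and `Y_demo` are illustrative names, not part of the original notebook):

```python
import numpy as np

# Stand-in arrays with the same shapes as the flower dataset (random values, not the real data)
X_demo = np.random.randn(2, 400)                  # 2 features, 400 examples (one per column)
Y_demo = np.random.randint(0, 2, size=(1, 400))   # one 0/1 label per example

n_features = X_demo.shape[0]   # rows are features (x1, x2)
m = X_demo.shape[1]            # columns are training examples
print("m =", m, "features =", n_features)

# scikit-learn expects one example per row, which is why the post transposes X and Y
print(X_demo.T.shape, Y_demo.T.shape)
```

The same indexing (`shape[0]` for layer sizes, `shape[1]` for the number of examples) is what the functions later in the post rely on.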

## Simple Logistic Regression

Before building a full neural network, let’s first see how logistic regression performs on this problem. You can use sklearn’s built-in functions to do that. Run the code below to create a logistic regression classifier and train it on the dataset.

`clf = sklearn.linear_model.LogisticRegressionCV()`

`clf.fit(X.T,Y.T)`
```
anaconda3\envs\tf\lib\site-packages\sklearn\utils\validation.py:761: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
anaconda3\envs\tf\lib\site-packages\sklearn\model_selection\_split.py:2053: FutureWarning: You should specify a value for 'cv' instead of relying on the default value. The default value will change from 3 to 5 in version 0.22.
warnings.warn(CV_WARNING, FutureWarning)
```

Out:

```
LogisticRegressionCV(Cs=10, class_weight=None, cv='warn', dual=False,
                     fit_intercept=True, intercept_scaling=1.0, max_iter=100,
                     multi_class='warn', n_jobs=None, penalty='l2',
                     random_state=None, refit=True, scoring=None, solver='lbfgs',
                     tol=0.0001, verbose=0)
```

You can now plot the decision boundary of this model. Run the code below.

```
# Plot the decision boundary for logistic regression
plot_decision_boundary(lambda x: clf.predict(x), X, Y)
plt.title("Logistic Regression")

# Print accuracy: the two dot products count the correctly predicted 1s and 0s
LR_predictions = clf.predict(X.T)
accuracy = (np.dot(Y, LR_predictions) + np.dot(1 - Y, 1 - LR_predictions)).item() / Y.size * 100
print('Accuracy of logistic regression: %d ' % accuracy +
      '% ' + "(percentage of correctly labelled datapoints)")
```
`Accuracy of logistic regression: 47 % (percentage of correctly labelled datapoints)`

Interpretation: The dataset is not linearly separable, so logistic regression doesn’t perform well. Hopefully a neural network will do better. Let’s try this now!
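
The accuracy expression used above is compact but easy to misread. The first dot product counts the points where label and prediction are both 1, and the second counts the points where both are 0. A small sketch with made-up labels and predictions (the `_demo` values are assumptions for illustration, not taken from the flower dataset):

```python
import numpy as np

# Made-up labels and predictions, purely for illustration
Y_demo = np.array([[1, 0, 1, 1, 0]])     # shape (1, 5), laid out like Y
pred_demo = np.array([1, 0, 0, 1, 1])    # shape (5,), laid out like clf.predict(X.T)

both_one = np.dot(Y_demo, pred_demo)             # label 1 and prediction 1
both_zero = np.dot(1 - Y_demo, 1 - pred_demo)    # label 0 and prediction 0
accuracy = (both_one + both_zero).item() / Y_demo.size * 100

# The same quantity, computed as the share of matching entries
accuracy_direct = np.mean(Y_demo.ravel() == pred_demo) * 100
print(accuracy, accuracy_direct)
```

Three of the five made-up predictions match their labels, so both expressions give 60 percent.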

## The general methodology to build a Neural Network is to:

1. Define the neural network structure (number of input units, number of hidden units, etc.).

2. Initialize the model’s parameters.

3. Loop:
   • Implement forward propagation
   • Compute loss
   • Implement backward propagation to get the gradients
   • Update parameters (gradient descent)

## Exercise:

Define three variables:

• n_x: the size of the input layer
• n_h: the size of the hidden layer (set this to 4)
• n_y: the size of the output layer

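```
def layer_sizes(X, Y):
    n_x = X.shape[0]  # size of the input layer (number of features)
    n_h = 4           # size of the hidden layer
    n_y = Y.shape[0]  # size of the output layer
    return (n_x, n_h, n_y)
```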

```
X_assess, Y_assess = layer_sizes_test_case()
(n_x, n_h, n_y) = layer_sizes(X_assess, Y_assess)
print("The size of the input layer is: n_x = " + str(n_x))
print("The size of the hidden layer is: n_h = " + str(n_h))
print("The size of the output layer is: n_y = " + str(n_y))
```
```
The size of the input layer is: n_x = 5
The size of the hidden layer is: n_h = 4
The size of the output layer is: n_y = 2
```

## Exercise:

Implement the function initialize_parameters().

Instructions:

• Make sure your parameters’ sizes are right. Refer to the neural network figure above if needed.
• You will initialize the weight matrices with random values. Use: np.random.randn(a,b) * 0.01 to randomly initialize a matrix of shape (a,b).
• You will initialize the bias vectors as zeros. Use: np.zeros((a,b)) to initialize a matrix of shape (a,b) with zeros.

```
def initialize_parameters(n_x, n_h, n_y):
    np.random.seed(2)  # fixed seed so the output matches the values shown below

    # Small random weights, zero biases
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))

    assert (W1.shape == (n_h, n_x))
    assert (b1.shape == (n_h, 1))
    assert (W2.shape == (n_y, n_h))
    assert (b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters
```

```
n_x, n_h, n_y = initialize_parameters_test_case()

parameters = initialize_parameters(n_x, n_h, n_y)
print("W1 = " + str(parameters["W1"]))
print("b1 = " + str(parameters["b1"]))
print("W2 = " + str(parameters["W2"]))
print("b2 = " + str(parameters["b2"]))
```
```
W1 = [[-0.00416758 -0.00056267]
[-0.02136196  0.01640271]
[-0.01793436 -0.00841747]
[ 0.00502881 -0.01245288]]
b1 = [[0.]
[0.]
[0.]
[0.]]
W2 = [[-0.01057952 -0.00909008  0.00551454  0.02292208]]
b2 = [[0.]]
```
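
The instructions ask for random weights but allow zero biases. One way to see why is to look at the gradient each hidden unit receives: if all units start with identical weights, they receive identical gradients and can never become different from each other. The sketch below is a standalone illustration under that reading; `hidden_weight_gradient` and the `_demo` data are hypothetical helpers written for this example, not functions from the post.

```python
import numpy as np

def hidden_weight_gradient(W1, W2, X, Y):
    """Gradient of the cross-entropy cost w.r.t. W1 (tanh hidden layer, sigmoid output, zero biases)."""
    m = X.shape[1]
    A1 = np.tanh(W1 @ X)
    A2 = 1 / (1 + np.exp(-(W2 @ A1)))
    dZ2 = A2 - Y
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
    return dZ1 @ X.T / m

rng = np.random.default_rng(0)
X_demo = rng.standard_normal((2, 5))      # made-up inputs: 2 features, 5 examples
Y_demo = np.array([[0, 1, 1, 0, 1]])      # made-up labels

# All-zero weights: no gradient reaches the hidden layer at all
dW1_zeros = hidden_weight_gradient(np.zeros((4, 2)), np.zeros((1, 4)), X_demo, Y_demo)

# Identical non-zero weights: every hidden unit gets exactly the same gradient row
dW1_const = hidden_weight_gradient(np.full((4, 2), 0.5), np.full((1, 4), 0.5), X_demo, Y_demo)

# Small random weights: the rows differ, so the units can learn different features
dW1_rand = hidden_weight_gradient(rng.standard_normal((4, 2)) * 0.01,
                                  rng.standard_normal((1, 4)) * 0.01, X_demo, Y_demo)

print(dW1_zeros)
print(dW1_const)
print(dW1_rand)
```

The factor 0.01 keeps the initial pre-activations small, so tanh starts in its near-linear region where its gradient is largest.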

## Exercise:

Implement forward_propagation().

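```
def forward_propagation(X, parameters):
    # Retrieve each parameter from the dictionary "parameters"
    W1 = parameters["W1"]
    W2 = parameters["W2"]
    b1 = parameters["b1"]
    b2 = parameters["b2"]

    # Forward pass: tanh hidden layer, sigmoid output layer
    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)

    assert(A2.shape == (1, X.shape[1]))

    # Keep the intermediate values; backward propagation needs them
    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}
    return A2, cache
```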

```
X_assess, parameters = forward_propagation_test_case()

A2, cache = forward_propagation(X_assess, parameters)

# Note: we use the mean here just to make sure that your output matches ours.
print(np.mean(cache['Z1']), np.mean(cache['A1']), np.mean(cache['Z2']), np.mean(cache['A2']))
```
```
0.26281864019752443 0.09199904522700109 -1.3076660128732143 0.21287768171914198
```
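
Written in matrix form with the same names as the code (X holds one example per column and σ denotes the sigmoid), the forward pass computes:

$$
\begin{aligned}
Z_1 &= W_1 X + b_1, & A_1 &= \tanh(Z_1), \\
Z_2 &= W_2 A_1 + b_2, & A_2 &= \sigma(Z_2) = \frac{1}{1 + e^{-Z_2}}.
\end{aligned}
$$

With n_x inputs, n_h hidden units and m examples, W_1 has shape (n_h, n_x), A_1 has shape (n_h, m), and A_2 has shape (1, m). The bias vectors are broadcast across the m columns.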

## Exercise:

Implement compute_cost() to compute the value of the cost J.

```
def compute_cost(A2, Y, parameters):
    m = Y.shape[1]  # number of examples

    # Cross-entropy cost, averaged over the m examples
    logprobs = np.multiply(Y, np.log(A2)) + np.multiply(1 - Y, np.log(1 - A2))
    cost = (-1 / m) * np.sum(logprobs)

    cost = float(np.squeeze(cost))  # makes sure cost is a plain float
    assert(isinstance(cost, float))
    return cost
```

```
A2, Y_assess, parameters = compute_cost_test_case()
print("cost = " + str(compute_cost(A2, Y_assess, parameters)))
```
```
cost = 0.6930587610394646
```
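
A cost close to 0.693 has a simple reading: it is approximately ln 2, the cross-entropy of a classifier that outputs 0.5 for every example. The sketch below checks this on made-up labels; `cross_entropy_cost` is a standalone helper written for this illustration and the `_demo` values are assumptions.

```python
import numpy as np

def cross_entropy_cost(A2, Y):
    """Mean binary cross-entropy between predictions A2 and labels Y, both of shape (1, m)."""
    m = Y.shape[1]
    return float(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m)

Y_demo = np.array([[1, 0, 1, 0]])               # made-up labels
uninformed = np.full((1, 4), 0.5)               # predicts 0.5 everywhere
confident = np.array([[0.9, 0.1, 0.8, 0.2]])    # leans the right way on every example

cost_uninformed = cross_entropy_cost(uninformed, Y_demo)
cost_confident = cross_entropy_cost(confident, Y_demo)

print(cost_uninformed, np.log(2))   # the two values agree
print(cost_confident)               # lower, because the predictions carry information
```

This also suggests why training typically starts near 0.693 with this initialization: the weights are tiny, so the output sigmoid sits near 0.5 for every point.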

## Exercise:

Implement the function backward_propagation()
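
For reference, these are the gradients the function has to produce, written with the same names as the code. Here m is the number of examples, ∘ is the element-wise product, and the sums run over the columns (one column per example). They follow from the chain rule applied to the cross-entropy cost with a sigmoid output and a tanh hidden layer, using tanh′(z) = 1 − tanh²(z):

$$
\begin{aligned}
dZ_2 &= A_2 - Y \\
dW_2 &= \frac{1}{m}\, dZ_2\, A_1^{\top} \\
db_2 &= \frac{1}{m} \sum_{i=1}^{m} dZ_2^{(i)} \\
dZ_1 &= W_2^{\top} dZ_2 \circ \left(1 - A_1^{2}\right) \\
dW_1 &= \frac{1}{m}\, dZ_1\, X^{\top} \\
db_1 &= \frac{1}{m} \sum_{i=1}^{m} dZ_1^{(i)}
\end{aligned}
$$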

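```
def backward_propagation(parameters, cache, X, Y):
    m = Y.shape[1]  # number of examples

    # Retrieve the weights and activations needed for the gradients
    W2 = parameters["W2"]
    A1 = cache["A1"]
    A2 = cache["A2"]

    # Output layer gradients
    dZ2 = A2 - Y
    dW2 = (1 / m) * np.dot(dZ2, A1.T)
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)

    # Hidden layer gradients; 1 - A1**2 is the derivative of tanh
    dZ1 = np.multiply(np.dot(W2.T, dZ2), 1 - np.power(A1, 2))
    dW1 = (1 / m) * np.dot(dZ1, X.T)
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)

    grads = {"dW1": dW1,
             "db1": db1,
             "dW2": dW2,
             "db2": db2}

    return grads
```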

```
parameters, cache, X_assess, Y_assess = backward_propagation_test_case()

grads = backward_propagation(parameters, cache, X_assess, Y_assess)
for name in ("dW1", "db1", "dW2", "db2"):
    print(name + " = " + str(grads[name]))
```
```
dW1 = [[ 0.00301023 -0.00747267]
[ 0.00257968 -0.00641288]
[-0.00156892  0.003893  ]
[-0.00652037  0.01618243]]
db1 = [[ 0.00176201]
[ 0.00150995]
[-0.00091736]
[-0.00381422]]
dW2 = [[ 0.00078841  0.01765429 -0.00084166 -0.01022527]]
db2 = [[-0.16655712]]
```
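
A common way to gain confidence in a backward pass is to compare it against finite differences. The sketch below is a standalone gradient check, not part of the original notebook: it re-implements a compact version of the same network (tanh hidden layer, sigmoid output, cross-entropy cost) on random made-up data, and all function names in it are hypothetical.

```python
import numpy as np

def forward(params, X):
    A1 = np.tanh(params["W1"] @ X + params["b1"])
    A2 = 1 / (1 + np.exp(-(params["W2"] @ A1 + params["b2"])))
    return A1, A2

def cost(params, X, Y):
    _, A2 = forward(params, X)
    return float(-np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)))

def analytic_grads(params, X, Y):
    """Backpropagation formulas for the network above."""
    m = X.shape[1]
    A1, A2 = forward(params, X)
    dZ2 = A2 - Y
    dZ1 = (params["W2"].T @ dZ2) * (1 - A1 ** 2)
    return {"W1": dZ1 @ X.T / m, "b1": dZ1.sum(axis=1, keepdims=True) / m,
            "W2": dZ2 @ A1.T / m, "b2": dZ2.sum(axis=1, keepdims=True) / m}

def numeric_grads(params, X, Y, eps=1e-6):
    """Central finite differences, one parameter entry at a time."""
    grads = {}
    for name, value in params.items():
        grad = np.zeros_like(value)
        for idx in np.ndindex(value.shape):
            original = value[idx]
            value[idx] = original + eps
            cost_plus = cost(params, X, Y)
            value[idx] = original - eps
            cost_minus = cost(params, X, Y)
            value[idx] = original          # restore the entry
            grad[idx] = (cost_plus - cost_minus) / (2 * eps)
        grads[name] = grad
    return grads

rng = np.random.default_rng(1)
X_demo = rng.standard_normal((2, 6))              # made-up inputs
Y_demo = rng.integers(0, 2, size=(1, 6))          # made-up 0/1 labels
params = {"W1": rng.standard_normal((4, 2)), "b1": np.zeros((4, 1)),
          "W2": rng.standard_normal((1, 4)), "b2": np.zeros((1, 1))}

exact = analytic_grads(params, X_demo, Y_demo)
approx = numeric_grads(params, X_demo, Y_demo)
max_error = max(np.max(np.abs(exact[k] - approx[k])) for k in params)
print("largest difference:", max_error)
```

If the two sets of gradients disagree by more than a small tolerance, the usual suspects are a missing 1/m factor, a wrong transpose, or a forgotten activation derivative.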

## Exercise:

Implement the update rule.
You have to use (dW1, db1, dW2, db2) in order to update (W1, b1, W2, b2).

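```
def update_parameters(parameters, grads, learning_rate = 1.2):
    # Retrieve each parameter from the dictionary "parameters"
    W1 = parameters["W1"]
    W2 = parameters["W2"]
    b1 = parameters["b1"]
    b2 = parameters["b2"]

    # Retrieve each gradient from the dictionary "grads"
    dW1 = grads["dW1"]
    db1 = grads["db1"]
    dW2 = grads["dW2"]
    db2 = grads["db2"]

    # Gradient descent step for each parameter
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1
    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters
```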

```
parameters, grads = update_parameters_test_case()
parameters = update_parameters(parameters, grads)

print("W1 = " + str(parameters["W1"]))
print("b1 = " + str(parameters["b1"]))
print("W2 = " + str(parameters["W2"]))
print("b2 = " + str(parameters["b2"]))
```
```
W1 = [[-0.00643025  0.01936718]
[-0.02410458  0.03978052]
[-0.01653973 -0.02096177]
[ 0.01046864 -0.05990141]]
b1 = [[-1.02420756e-06]
[ 1.27373948e-05]
[ 8.32996807e-07]
[-3.20136836e-06]]
W2 = [[-0.01041081 -0.04463285  0.01758031  0.04747113]]
b2 = [[0.00010457]]
```
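
The update rule is the same for every parameter: subtract the learning rate times the gradient. How well that works depends on the step size. The toy sketch below, which minimizes J(w) = w² rather than the network cost, is only an illustration of that sensitivity; `gradient_descent` is a hypothetical helper, and the behaviour on this toy problem says nothing about which learning rate suits the flower dataset.

```python
def gradient_descent(w, learning_rate, steps=20):
    """Minimize J(w) = w**2, whose gradient is 2*w, with plain gradient descent."""
    for _ in range(steps):
        w = w - learning_rate * 2 * w
    return w

w_small_step = gradient_descent(1.0, learning_rate=0.1)   # each step multiplies w by 0.8
w_large_step = gradient_descent(1.0, learning_rate=1.2)   # each step multiplies w by -1.4

print(w_small_step)   # shrinks toward the minimum at 0
print(w_large_step)   # overshoots further on every step
```

On this toy function a step size of 1.2 diverges, while the post's runs show the same value working for the network here, so the right value is problem-dependent.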

## Exercise:

Build your neural network model in nn_model().

```
def nn_model(X, Y, n_h, num_iterations = 10000, print_cost=True):
    np.random.seed(3)
    # Take n_x and n_y from the data; keep the n_h passed in by the caller
    n_x = layer_sizes(X, Y)[0]
    n_y = layer_sizes(X, Y)[2]

    parameters = initialize_parameters(n_x, n_h, n_y)

    for i in range(num_iterations):
        # Forward propagation, cost, backward propagation, gradient descent update
        A2, cache = forward_propagation(X, parameters)
        cost = compute_cost(A2, Y, parameters)
        grads = backward_propagation(parameters, cache, X, Y)
        parameters = update_parameters(parameters, grads)

        # Print the cost every 1000 iterations
        if print_cost and i % 1000 == 0:
            print ("Cost after iteration %i: %f" % (i, cost))

    return parameters
```

```
X_assess, Y_assess = nn_model_test_case()

parameters = nn_model(X_assess, Y_assess, 4, num_iterations=10000, print_cost=True)
print("W1 = " + str(parameters["W1"]))
print("b1 = " + str(parameters["b1"]))
print("W2 = " + str(parameters["W2"]))
print("b2 = " + str(parameters["b2"]))
```
```
Cost after iteration 0: 0.692739
Cost after iteration 1000: 0.000218
Cost after iteration 2000: 0.000107
Cost after iteration 3000: 0.000071
Cost after iteration 4000: 0.000053
Cost after iteration 5000: 0.000042
Cost after iteration 6000: 0.000035
Cost after iteration 7000: 0.000030
Cost after iteration 8000: 0.000026
Cost after iteration 9000: 0.000023
W1 = [[-0.65848169  1.21866811]
[-0.76204273  1.39377573]
[ 0.5792005  -1.10397703]
[ 0.76773391 -1.41477129]]
b1 = [[ 0.287592  ]
[ 0.3511264 ]
[-0.2431246 ]
[-0.35772805]]
W2 = [[-2.45566237 -3.27042274  2.00784958  3.36773273]]
b2 = [[0.20459656]]
```

## Exercise:

Use your model to predict by building predict(). Use forward propagation to predict results.

```
def predict(parameters, X):
    # Forward propagation gives probabilities; rounding turns them into 0/1 labels
    A2, cache = forward_propagation(X, parameters)
    predictions = np.round(A2)
    return predictions
```

```
parameters, X_assess = predict_test_case()

predictions = predict(parameters, X_assess)
print("predictions mean = " + str(np.mean(predictions)))
```
```
predictions mean = 0.6666666666666666
```
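
Rounding the sigmoid output is equivalent to thresholding it at 0.5, with one detail worth knowing: NumPy rounds exact halves to the nearest even number, so an output of exactly 0.5 becomes 0. A short sketch with made-up probabilities (`A2_demo` is an assumption for illustration) compares rounding with an explicit threshold:

```python
import numpy as np

A2_demo = np.array([[0.2, 0.5, 0.51, 0.9]])    # made-up output probabilities

rounded = np.round(A2_demo)                    # exact halves go to the nearest even number
thresholded = (A2_demo > 0.5).astype(float)    # explicit threshold at 0.5

print(rounded)
print(thresholded)
```

Both approaches give the same labels here. The explicit comparison makes the decision rule visible and is easier to adapt if a threshold other than 0.5 is ever needed.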

It is time to run the model and see how it performs on the planar dataset. Run the following code to test your model with a single hidden layer.

```
# Build a model with a n_h-dimensional hidden layer
parameters = nn_model(X, Y, n_h = 4, num_iterations = 10000, print_cost=True)

# Plot the decision boundary
plot_decision_boundary(lambda x: predict(parameters, x.T), X, Y)
plt.title("Decision Boundary for hidden layer size " + str(4))
```
```
Cost after iteration 0: 0.693048
Cost after iteration 1000: 0.288083
Cost after iteration 2000: 0.254385
Cost after iteration 3000: 0.233864
Cost after iteration 4000: 0.226792
Cost after iteration 5000: 0.222644
Cost after iteration 6000: 0.219731
Cost after iteration 7000: 0.217504
Cost after iteration 8000: 0.219504
Cost after iteration 9000: 0.218571
```

Out:

`Text(0.5, 1.0, 'Decision Boundary for hidden layer size 4')`

```
# Print accuracy: the two dot products count the correctly predicted 1s and 0s
predictions = predict(parameters, X)
accuracy = (np.dot(Y, predictions.T) + np.dot(1 - Y, 1 - predictions.T)).item() / Y.size * 100
print('Accuracy: %d' % accuracy + '%')
```
```
Accuracy: 90%
```

## References:

Deep Learning Specialization, Coursera: https://www.coursera.org/

### planar_utils.py file code below:

```
import matplotlib.pyplot as plt
import numpy as np
import sklearn
import sklearn.datasets
import sklearn.linear_model

def plot_decision_boundary(model, X, y):
    # Set min and max values and give it some padding
    x_min, x_max = X[0, :].min() - 1, X[0, :].max() + 1
    y_min, y_max = X[1, :].min() - 1, X[1, :].max() + 1
    h = 0.01
    # Generate a grid of points with distance h between them
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict the function value for the whole grid
    Z = model(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    # Plot the contour and training examples
    plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral)
    plt.ylabel('x2')
    plt.xlabel('x1')
    # Flatten the labels: recent matplotlib versions reject a 2-D color array
    plt.scatter(X[0, :], X[1, :], c=y.ravel(), cmap=plt.cm.Spectral)

def sigmoid(x):
    """
    Compute the sigmoid of x
    Arguments:
    x -- A scalar or numpy array of any size.
    Return:
    s -- sigmoid(x)
    """
    s = 1/(1+np.exp(-x))
    return s

def load_planar_dataset():
    np.random.seed(1)
    m = 400 # number of examples
    N = int(m/2) # number of points per class
    D = 2 # dimensionality
    X = np.zeros((m,D)) # data matrix where each row is a single example
    Y = np.zeros((m,1), dtype='uint8') # labels vector (0 for red, 1 for blue)
    a = 4 # maximum ray of the flower

    for j in range(2):
        ix = range(N*j,N*(j+1))
        t = np.linspace(j*3.12,(j+1)*3.12,N) + np.random.randn(N)*0.2 # theta
        r = a*np.sin(4*t) + np.random.randn(N)*0.2 # radius
        X[ix] = np.c_[r*np.sin(t), r*np.cos(t)]
        Y[ix] = j

    X = X.T
    Y = Y.T

    return X, Y

def load_extra_datasets():
    N = 200
    noisy_circles = sklearn.datasets.make_circles(n_samples=N, factor=.5, noise=.3)
    noisy_moons = sklearn.datasets.make_moons(n_samples=N, noise=.2)
    blobs = sklearn.datasets.make_blobs(n_samples=N, random_state=5, n_features=2, centers=6)
    gaussian_quantiles = sklearn.datasets.make_gaussian_quantiles(mean=None, cov=0.5, n_samples=N, n_features=2, n_classes=2, shuffle=True, random_state=None)
    no_structure = np.random.rand(N, 2), np.random.rand(N, 2)

    return noisy_circles, noisy_moons, blobs, gaussian_quantiles, no_structure
```
