Neural Networks and Deep Learning

  • Most of the popular applications have been built with Supervised Learning: carefully selecting the input x and the output y, and training a network to map between them.
  • Supervised Learning
    • Structured Data
    • Unstructured Data = Audio / Images
  • Why is Deep Learning taking off?
    • Scale of labeled data, greater computation, and algorithmic improvements (e.g., ReLU replacing sigmoid) have been the main drivers.

Q. How do we go through the entire training set of m examples without an explicit for loop?
A. Vectorize: stack the examples as columns of a matrix X and compute on the whole batch with matrix operations.
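A minimal numpy sketch of vectorized logistic regression in this spirit (my own variable names, not the course's starter code): the m examples are stacked as columns of X with shape (n_x, m), labels live in Y with shape (1, m), and both the forward and backward passes touch all examples at once.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    """One vectorized pass of logistic regression over all m examples.

    w: (n_x, 1) weights, b: scalar bias,
    X: (n_x, m) inputs stacked column-wise, Y: (1, m) labels.
    """
    m = X.shape[1]
    # Forward pass: Z and A are computed for every example in one shot.
    Z = np.dot(w.T, X) + b             # (1, m)
    A = sigmoid(Z)                     # (1, m)
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    # Backward pass: gradients, still with no explicit loop over examples.
    dZ = A - Y                         # (1, m)
    dw = np.dot(X, dZ.T) / m           # (n_x, 1)
    db = np.sum(dZ) / m                # scalar
    return dw, db, cost
```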


Week 3


Activation Functions

ReLU and tanh (which centers its output around 0) generally work better than sigmoid (whose output is centered around 0.5).
Sigmoid is mainly useful in the output layer of a binary classifier, where a value in (0, 1) is needed.

Leaky ReLU = keeps a small non-zero gradient for negative inputs (z < 0), so those units do not stop learning.

The vanishing/exploding gradient problem: when gradients become extremely small or extremely large, weight updates are either negligible or unstable, which slows down or derails learning.
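For reference, a small numpy sketch of these activations and their derivatives (the helper names are mine, not from the course). Note that sigmoid's derivative never exceeds 0.25, one reason deep sigmoid networks are prone to vanishing gradients.

```python
import numpy as np

def sigmoid(z):
    a = 1.0 / (1.0 + np.exp(-z))
    return a, a * (1 - a)                 # derivative peaks at 0.25

def tanh(z):
    a = np.tanh(z)
    return a, 1 - a ** 2                  # zero-centered output

def relu(z):
    return np.maximum(0, z), (z > 0).astype(float)          # gradient is 0 for z < 0

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z), np.where(z > 0, 1.0, alpha)  # small slope for z < 0
```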

Why use a non-linear activation function?

Removing the activation functions makes the entire computation linear: a composition of linear layers collapses into a single linear layer. The network then cannot learn complex decision boundaries, the deep layers buy nothing, and the model is equivalent to a standard linear/logistic regression model.

A linear activation function is useful only in the output layer, e.g., when the network is used for a regression problem with a real-valued output.
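A quick numerical check of the collapse argument, with made-up layer shapes: two stacked layers with identity (i.e., no) activation compute exactly the same function as one linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))                  # a single 3-dimensional input

# Two "layers" with no non-linearity in between.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=(4, 1))
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=(2, 1))
two_layers = W2 @ (W1 @ x + b1) + b2

# The same function folded into a single linear layer.
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

print(np.allclose(two_layers, one_layer))    # True: the extra layer added nothing
```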


Why randomly initialize weights?

If all weights are initialized to 0, every hidden unit computes the same value and receives the same gradient, so the units remain identical after every update; the extra neurons are redundant because the symmetry is never broken.
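A minimal sketch of the usual fix for a single hidden layer (the sizes n_x, n_h, n_y are illustrative; a small scale such as 0.01 keeps tanh/sigmoid units out of their flat, low-gradient regions):

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    # Small random weights break the symmetry between hidden units;
    # biases can safely start at zero.
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
```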

Week 4

