- Random initialization is the last piece needed to implement neural network training.
- Initializing all theta to zero may eventually work in logistic regression, but it fails in neural networks.
- With all-zero (or any identical) initialization, every hidden unit computes exactly the same function of the input, so the units receive identical gradient updates and remain identical after every iteration; the network never breaks out of this symmetry (see the sketch below).
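A minimal sketch of the symmetry problem, assuming NumPy and a hypothetical tiny network (3 inputs, 4 hidden units, 1 output, no bias terms, squared-error loss): with all-zero initialization, the rows of Theta1 stay identical to each other no matter how many gradient steps are taken.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical tiny network: 3 inputs, 4 hidden units, 1 output
x = np.array([1.0, 2.0, 3.0])   # one training example
y = 1.0                          # its label
Theta1 = np.zeros((4, 3))        # input  -> hidden weights, all zero
Theta2 = np.zeros((1, 4))        # hidden -> output weights, all zero

alpha = 0.5
for step in range(5):
    # forward pass: every hidden unit outputs the same value
    a1 = sigmoid(Theta1 @ x)
    a2 = sigmoid(Theta2 @ a1)
    # backward pass (squared-error loss with sigmoid output)
    delta2 = (a2 - y) * a2 * (1 - a2)
    delta1 = (Theta2.T @ delta2) * a1 * (1 - a1)
    Theta2 -= alpha * np.outer(delta2, a1)
    Theta1 -= alpha * np.outer(delta1, x)

# After several updates the rows of Theta1 (and the entries of Theta2) are
# still identical: the hidden units never differentiate from one another.
print(Theta1)
print(Theta2)
```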
- The problem is solved by random initialization (symmetry breaking).
- The random number generator produces values between 0 and 1; these are then scaled and shifted into the interval [-epsilon_init, epsilon_init].
- This epsilon_init is a constant we set manually; it is different from the epsilon used in gradient checking (see the sketch below).
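A minimal sketch of such a random initialization, assuming NumPy; the function name rand_initialize_weights, the layer sizes, and the default epsilon_init = 0.12 are illustrative choices, not fixed by these notes.

```python
import numpy as np

def rand_initialize_weights(l_in, l_out, epsilon_init=0.12):
    """Randomly initialize the weights of a layer with l_in incoming
    connections and l_out outgoing connections (plus a bias column).

    np.random.rand gives values in [0, 1); scaling by 2 * epsilon_init and
    shifting by -epsilon_init maps them into [-epsilon_init, epsilon_init).
    """
    return np.random.rand(l_out, 1 + l_in) * 2 * epsilon_init - epsilon_init

# Example: a network with 400 inputs, 25 hidden units, 10 outputs
Theta1 = rand_initialize_weights(400, 25)   # shape (25, 401)
Theta2 = rand_initialize_weights(25, 10)    # shape (10, 26)
```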
- In summary: randomly initialize each theta to a small value close to zero, i.e. within [-epsilon_init, epsilon_init].
- Do backpropagation, gradient checking, and advanced optimization to minimize the cost function J(Theta), starting from the randomly initialized theta (symmetry breaking).
- This should find a good value of theta; since J(Theta) is non-convex, optimizing from a random start may only reach a local optimum, but in practice it is usually a good one.
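A minimal sketch of that last step, assuming NumPy and SciPy; nn_cost_function is a hypothetical placeholder (stubbed here with a simple quadratic so the sketch runs) that in a real network would compute J(Theta) and its gradient via forward propagation and backpropagation.

```python
import numpy as np
from scipy.optimize import minimize

def nn_cost_function(theta):
    # Placeholder cost: in a real network this would run forward prop to get
    # J(Theta) and backprop to get the gradient, both on unrolled parameters.
    J = 0.5 * np.sum(theta ** 2)
    grad = theta
    return J, grad

# Random initialization (symmetry breaking), then advanced optimization.
epsilon_init = 0.12
initial_theta = np.random.rand(25 * 401 + 10 * 26) * 2 * epsilon_init - epsilon_init

result = minimize(nn_cost_function, initial_theta, jac=True,
                  method='L-BFGS-B', options={'maxiter': 50})
theta_opt = result.x   # trained (unrolled) parameters
```

Passing jac=True tells scipy.optimize.minimize that the cost function returns both the cost and its gradient, which is exactly what backpropagation provides.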