• The random function returns a value between 0 and 1
  • The epsilon here is the init epsilon, which we set manually; it is not the same as the epsilon we know from gradient checking


  • In summary, randomly initialize each value to something close to zero
  • Then do backprop, do gradient checking, and use advanced optimization to minimize the cost function J(theta), starting from the random init so that symmetry is broken
  • This should find a good value of theta
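
A minimal sketch of the random init described above, assuming Python's standard `random` module. The names `INIT_EPSILON` and `random_init`, the value 0.12, and the 10 x 11 shape are illustrative assumptions, not taken from these notes. It scales a random value between 0 and 1 so that each entry lands between -epsilon and +epsilon, i.e. close to zero.

```python
import random

# Assumed value, picked by hand; this is NOT the gradient-checking epsilon.
INIT_EPSILON = 0.12


def random_init(rows, cols, epsilon=INIT_EPSILON, seed=None):
    """Return a rows x cols list of lists with entries in [-epsilon, epsilon)."""
    rng = random.Random(seed)
    # rng.random() is in [0, 1); multiplying by 2 * epsilon and subtracting
    # epsilon shifts it into [-epsilon, epsilon).
    return [
        [rng.random() * 2 * epsilon - epsilon for _ in range(cols)]
        for _ in range(rows)
    ]


# Hypothetical layer shape: 10 units, each with 10 inputs plus a bias term.
theta1 = random_init(10, 11, seed=0)
```

Because the entries differ from one another, the units in a layer do not start out identical, which is the symmetry breaking mentioned above. The resulting `theta1` would then be the starting point passed to backprop and the optimizer.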