-optimize the cost function faster than plain gradient descent
-scale efficiently to problems with lots of features
Given code that computes the cost function and its derivative (gradient) terms, these new algorithms can minimize the cost function for us.
The three advanced optimization algorithms (conjugate gradient, BFGS, and L-BFGS) are quite complex and shouldn't be implemented by hand unless you're an expert in numerical computing.
For whichever language we use, pick the best library by testing a few against the particular problem we have; writing our own implementation is not really recommended.
Once we've written the code (shown on the right of the slide) that computes the cost and its gradient, we can call an advanced optimization routine.
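A minimal sketch of what that cost-function code could look like, assuming the simple quadratic example mentioned below, J(theta) = (theta(1) - 5)^2 + (theta(2) - 5)^2 (the function name and the constant 5 are just illustrative):

    % costFunction.m -- returns both the cost J(theta) and its gradient vector
    function [jVal, gradient] = costFunction(theta)
      jVal = (theta(1) - 5)^2 + (theta(2) - 5)^2;   % value of the cost function J(theta)
      gradient = zeros(2, 1);                       % column vector of partial derivatives
      gradient(1) = 2 * (theta(1) - 5);             % dJ/dtheta(1)
      gradient(2) = 2 * (theta(2) - 5);             % dJ/dtheta(2)
    end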
Given the options we set up, the Octave function fminunc (unconstrained function minimization) takes a function handle (pointer) to the cost function, an initial theta, and the options; it returns the optimal theta and automatically chooses a learning rate for us.
optimset builds the data structure of options for the optimization.
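For example (these particular option values are just illustrative):

    options = optimset('GradObj', 'on', 'MaxIter', 100);  % we supply the gradient ourselves; run at most 100 iterations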
Here's how to run it in Octave, once the cost function is defined:
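A minimal sketch of the call, using the costFunction and options sketched above (variable names are just illustrative):

    initialTheta = zeros(2, 1);                                % starting point for the search
    [optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);
    % optTheta     -> theta that minimizes the cost
    % functionVal  -> value of the cost at optTheta
    % exitFlag     -> convergence status reported by fminunc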
The exit flag lets us verify that the optimization converged (an exitFlag of 1 indicates convergence).
initialTheta has to be a vector with at least two elements; fminunc doesn't work on a one-dimensional theta.
This example is just optimizing a simple quadratic cost function (the one sketched above).
Remember that Octave indexing starts from 1, not 0.
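So the parameter the math usually writes as theta_0 ends up stored in theta(1); a purely illustrative mapping:

    theta = [1.5; -0.3; 2.0];   % hypothetical parameter vector: intercept plus two feature weights
    theta0 = theta(1);          % what the math calls theta_0
    theta1 = theta(2);          % what the math calls theta_1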
We'll need this kind of code whenever we want to optimize a cost function going forward.
These are advanced optimizers that can perform better than gradient descent.