-a simpler cost function
-applying gradient descent to logistic regression
-a fully working logistic regression

this is a simpler and more compact way of writing the cost function


taking advantage of the fact that y is always either 0 or 1,
the two cases of the cost function can be combined into a single line; for any given example, one of the two terms is multiplied by zero and drops out..
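
written out (this is the standard combined form), the cost for one example is:

    Cost(h(x), y) = -y * log(h(x)) - (1 - y) * log(1 - h(x))

so the full cost over m training examples is:

    J(theta) = -(1/m) * sum over i=1..m of [ y^(i) * log(h(x^(i))) + (1 - y^(i)) * log(1 - h(x^(i))) ]

when y = 1 only the first term survives, and when y = 0 only the second one does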



this one-line cost function plays the same role here that the squared-error cost plays in linear regression
what's left is how to minimize J(theta)


the gradient descent update shown above already has the derivative of the cost function J(theta) plugged in

the update rule looks almost the same as in linear regression, except that the hypothesis is now the sigmoid stated above, h(x) = 1 / (1 + e^(-theta^T x)), instead of theta^T x...
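
written out in the standard form, the update rule is:

    repeat until convergence {
        theta_j := theta_j - alpha * (1/m) * sum over i=1..m of (h(x^(i)) - y^(i)) * x_j^(i)
    }
    (updating all theta_j simultaneously)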

the same technique used in linear regression also applies for checking that gradient descent converges to the global optimum...
(plot the value of the cost function J(theta) after every iteration and check that it decreases on every iteration)
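
a minimal sketch of that check in Python (J_history is an illustrative name for the recorded cost values; the vectorized implementation below returns such a list):

    import matplotlib.pyplot as plt

    def plot_convergence(J_history):
        # J_history: value of J(theta) recorded after each iteration
        plt.plot(range(len(J_history)), J_history)
        plt.xlabel("iteration")
        plt.ylabel("J(theta)")
        plt.show()  # the curve should decrease on every iteration if alpha is well chosen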


this is the vectorized implementation of gradient descent for logistic regression...
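
a minimal NumPy sketch of that vectorized update, theta := theta - (alpha/m) * X^T * (h - y); function and variable names here are illustrative, not taken from the course code:

    import numpy as np

    def sigmoid(z):
        # logistic function, maps any real number into (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def compute_cost(theta, X, y):
        # J(theta), the combined cost function from above
        h = sigmoid(X @ theta)
        return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

    def gradient_descent(X, y, theta, alpha, num_iters):
        # X: (m, n) design matrix, y: (m,) labels in {0, 1}
        m = len(y)
        J_history = []
        for _ in range(num_iters):
            h = sigmoid(X @ theta)                         # hypothesis for all m examples at once
            theta = theta - (alpha / m) * (X.T @ (h - y))  # simultaneous update of every theta_j
            J_history.append(compute_cost(theta, X, y))
        return theta, J_history

because the hypothesis is computed for all m examples in one matrix product, there is no explicit loop over examples or over the individual theta_j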

feature scaling also applies here, and it makes gradient descent converge faster
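
a small sketch of one common form of feature scaling, standardization (names are illustrative):

    import numpy as np

    def scale_features(X):
        # rescale every column (feature) to zero mean and unit variance
        mu = X.mean(axis=0)
        sigma = X.std(axis=0)
        return (X - mu) / sigma, mu, sigma

the same mu and sigma must be reused to scale any new example before making a prediction on it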

logistic regression is probably the most widely used algorithm for classification...