Optimization Objective

Supervised learning: regardless of which algorithm you use, what matters most is getting a lot of data and choosing wisely which features to incorporate, how to regularize, and so on.

One of the most powerful supervised learning algorithms is the SVM: compared to logistic regression and neural networks, it can learn complex nonlinear functions very effectively.

It is used in many industries, and it is the last supervised learning algorithm that Andrew Ng spends a large amount of time on.

SVM: a modified form of logistic regression.

We want the hypothesis to approach the actual output. When y = 1, the value of z must be much larger than zero; with z much larger than zero, the hypothesis approaches 1. And it works the other way around for y = 0.

Remember that in this alternative view, one of the two cost terms is nullified depending on whether y == 1 or y == 0.
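A minimal sketch (in Python, with function names of my own choosing) of the per-example logistic cost, showing how y nullifies one of the two terms:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_cost(y, z):
    """Per-example logistic cost: -y*log(h) - (1-y)*log(1-h), with h = sigmoid(z)."""
    h = sigmoid(z)
    return -y * math.log(h) - (1 - y) * math.log(1 - h)

# When y == 1, only the -log(h) term survives; it shrinks as z grows.
print(logistic_cost(1, 5))   # small: z >> 0, so h is close to 1
print(logistic_cost(1, -5))  # large: z << 0, so h is close to 0
# When y == 0, only the -log(1-h) term survives; it shrinks as z falls.
print(logistic_cost(0, -5))  # small: z << 0, so h is close to 0
```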

The formula above is for a single training example. Notice the value z = 1: that is the joint point between the slanted line and the flat line.

In logistic regression, with y = 1, as the cost approaches zero (the hypothesis approaches the actual output), z becomes much larger than zero.

And on the bottom right, with y = 0: as the cost approaches zero (the hypothesis approaches the actual output), z becomes much smaller than zero.

In the SVM, we modify the logistic regression cost to be piecewise linear, using two line segments: one slanted (going down to the right) and one flat. By doing this, we give the SVM a big computational advantage and make it much easier to optimize.

With these two piecewise-linear cost functions, denoted cost1(z) and cost0(z), we are ready to build the SVM objective.
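A hedged sketch of the two piecewise-linear costs. The exact slope of the slanted segment is a free choice when approximating the logistic curve; slope 1 is assumed here, and the joint points sit at z = 1 and z = -1 as in the figures:

```python
def cost1(z):
    """Cost for y = 1: flat (zero) for z >= 1, slanted line for z < 1."""
    return max(0.0, 1.0 - z)

def cost0(z):
    """Cost for y = 0: flat (zero) for z <= -1, slanted line for z > -1."""
    return max(0.0, 1.0 + z)

print(cost1(2))   # 0.0: z >= 1, on the flat segment, no penalty
print(cost1(0))   # 1.0: z < 1, on the slanted segment
print(cost0(-2))  # 0.0: z <= -1, on the flat segment, no penalty
```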

For the SVM, we get rid of the 1/m factor; this is another convention used for SVMs. Removing the constant 1/m does not change the parameters that minimize the cost. See the examples in red for more intuition.
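A toy illustration (not the SVM objective itself, just a made-up 1D cost) of why scaling a cost by a positive constant, like dropping 1/m, does not move its minimizer:

```python
def argmin_grid(f, lo=-10.0, hi=10.0, steps=2001):
    """Brute-force argmin of f over an evenly spaced grid (illustration only)."""
    best_u, best_v = lo, f(lo)
    for i in range(steps):
        u = lo + (hi - lo) * i / (steps - 1)
        v = f(u)
        if v < best_v:
            best_u, best_v = u, v
    return best_u

# f(u) = (u - 5)^2 and m * f(u) share the same argmin u = 5.
m = 50
f = lambda u: (u - 5) ** 2
print(argmin_grid(f))                   # 5.0
print(argmin_grid(lambda u: m * f(u)))  # 5.0 (same minimizer)
```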

As another convention, we simplify the equation by denoting the cost term as A and the regularization term as B.

Now we modify this convention to match the simpler form on the right.

The parameter lambda is dropped; instead we use a parameter C.

It is not that C is literally defined as 1/lambda, but choosing C = 1/lambda yields the same optimal parameters.

When C is really small, B carries more weight than C*A. We no longer weight B up (as lambda did); instead we make A lighter. Either way, we are still minimizing the cost function.
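Putting the pieces together, a sketch of the SVM objective J(theta) = C*A + B under the assumptions above (hinge slopes of 1, theta[0] treated as the bias and excluded from regularization; all names here are my own):

```python
def svm_objective(theta, X, y, C):
    """SVM cost: C * A (data term) + B (regularization term)."""
    cost1 = lambda z: max(0.0, 1.0 - z)  # cost for y = 1
    cost0 = lambda z: max(0.0, 1.0 + z)  # cost for y = 0
    A = 0.0
    for xi, yi in zip(X, y):
        z = sum(t * x for t, x in zip(theta, xi))  # z = theta . x
        A += yi * cost1(z) + (1 - yi) * cost0(z)
    B = 0.5 * sum(t * t for t in theta[1:])  # skip the bias term theta[0]
    return C * A + B

# Tiny example: theta classifies both points with margin >= 1, so A = 0
# and only the regularization term B remains.
theta = [0.0, 2.0]
X = [[1.0, 1.0], [1.0, -1.0]]  # first feature is the bias term x0 = 1
y = [1, 0]
print(svm_objective(theta, X, y, C=1.0))  # 2.0 (pure regularization term)
```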

This is the mathematical definition of the SVM optimization objective.

Unlike logistic regression, the SVM no longer outputs a probability. Instead, the hypothesis outputs either 1 or 0 directly (not a value ranging from 0 to 1 as in logistic regression), according to the condition stated above.
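A minimal sketch of that hypothesis (assuming the usual threshold at z = 0; names are my own):

```python
def svm_predict(theta, x):
    """SVM hypothesis: outputs 1 if theta . x >= 0, else 0 (no probability)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if z >= 0 else 0

print(svm_predict([0.0, 2.0], [1.0, 3.0]))   # 1: z = 6 >= 0
print(svm_predict([0.0, 2.0], [1.0, -3.0]))  # 0: z = -6 < 0
```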

Next: what hypothesis the SVM produces, digging deeper into the optimization objective, and adding a bit more machinery to handle complex nonlinear functions.