Using SVM

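Perform feature scaling so that no single feature outweighs (and effectively nullifies) the others. Example: with house data, the size of the house takes values in the thousands and would nullify the number of bedrooms, which takes values between 1 and 5.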
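A minimal sketch of the scaling step, assuming simple standardization (zero mean, unit standard deviation). The house sizes and bedroom counts are made-up illustrative values, and `standardize` is a hypothetical helper name, not something from a particular package.

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale one feature column to zero mean and unit standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]

# Hypothetical data: house size in square feet, number of bedrooms.
sizes = [2104.0, 1600.0, 2400.0, 1416.0]
bedrooms = [3.0, 3.0, 4.0, 2.0]

scaled_sizes = standardize(sizes)
scaled_bedrooms = standardize(bedrooms)
```

After scaling, both features live on a comparable range, so a distance between two houses is no longer decided by size alone.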

Use Mercer's theorem to avoid invalid kernels. SVM packages were written around numerical optimization tricks so that the SVM can be computed efficiently, and those tricks assume a well-behaved kernel. So, for the SVM to be solved in that optimized way, every kernel used with an SVM must satisfy Mercer's theorem.
Polynomial kernel is used less often but works in some cases. It needs two parameters: the constant and the degree of the polynomial.
It usually performs worse than the Gaussian kernel, and is mostly used when the inputs x are all non-negative.
String kernel: used when the input is a string, to measure similarity between strings. Other kernels like these turn up in other people's work. But the linear kernel and the Gaussian kernel are the two most popular SVM kernels.
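A rough sketch of the kernels named above, written as plain similarity functions. The function names, the default `sigma`, `constant` and `degree`, and the sample points are assumptions for illustration. The last two helpers do a spot check related to Mercer's theorem: a valid kernel gives a symmetric Gram matrix K with z^T K z >= 0 for every z. Checking one z is only a necessary condition, not a proof.

```python
import math

def linear_kernel(x, l):
    """Linear kernel (no kernel): plain inner product."""
    return sum(a * b for a, b in zip(x, l))

def gaussian_kernel(x, l, sigma=1.0):
    """Gaussian kernel: exp(-||x - l||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, l))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def polynomial_kernel(x, l, constant=1.0, degree=2):
    """Polynomial kernel: (x . l + constant)^degree."""
    return (linear_kernel(x, l) + constant) ** degree

def gram_matrix(kernel, points):
    """Matrix of pairwise kernel values K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]

def quadratic_form(K, z):
    """Compute z^T K z; this is non-negative for any z if K comes from a valid kernel."""
    n = len(z)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))

# Hypothetical sample points and test vector.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gram_matrix(gaussian_kernel, points)
value = quadratic_form(K, [1.0, -1.0, 0.5])
```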

The sensible way to use SVM for multiclass classification is to use the built-in multiclass method that already ships with whatever software package you use.
There is a high chance the package already has multiclass SVM built in.
Alternatively, use the one-vs-all method. Train K SVMs, one per class, giving theta1 to thetaK: theta1 is trained to separate y=1 from the rest, and so on up to thetaK for y=K. Then pick the class i whose theta gives the largest hypothesis value.
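The prediction step of one-vs-all can be sketched as below. This assumes the K parameter vectors have already been trained elsewhere, that the hypothesis for class i is the raw score theta_i . x, and that x carries a leading bias term of 1. The theta values are invented for illustration only.

```python
def score(theta, x):
    """Raw score theta . x for one class (x includes the bias term x0 = 1)."""
    return sum(t * v for t, v in zip(theta, x))

def predict_one_vs_all(thetas, x):
    """Return the class label (1..K) whose theta gives the largest score."""
    scores = [score(theta, x) for theta in thetas]
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index + 1  # classes are numbered from 1

# Hypothetical trained parameters for K = 3 classes.
thetas = [
    [1.0, 2.0, -1.0],   # theta1: y=1 vs rest
    [0.0, -1.0, 1.0],   # theta2: y=2 vs rest
    [-1.0, 0.5, 0.5],   # theta3: y=3 vs rest
]
x = [1.0, 0.2, 3.0]
predicted = predict_one_vs_all(thetas, x)
```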

When do we use which of these two algorithms (logistic regression or SVM)?
The first condition is many features but little data. A linear model should be enough here (logistic regression or SVM without a kernel), because a complex non-linear function would be hard to fit: it only increases complexity and is prone to overfitting, especially with so few training examples.
The second condition (few features, a moderate number of training examples) is where SVM with a Gaussian kernel outshines the other algorithms.
The third condition (few features, a huge number of training examples) is where SVM with a Gaussian kernel tends to struggle, even with a software package. This is the case discussed earlier, where the number of parameters matches the number of training examples. So a huge training set increases the cost of the computation significantly.
Logistic regression and SVM without a kernel perform similarly and give similar results. Only in special cases does one perform better than the other.
Sometimes an SVM from a built-in package is a better choice than a neural network, especially in the regimes mentioned above.
SVM can also learn complex non-linear functions while its optimization problem stays convex, so it always finds the global optimum and there is no need to worry about it getting stuck in a local optimum.
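The three conditions above can be written down as a rough rule of thumb. This is only a sketch: the function name, the returned labels, and the `huge_m` cutoff of 50,000 examples are assumptions chosen for illustration, not values stated in these notes.

```python
def suggest_regime(n_features, m_examples, huge_m=50_000):
    """Map (number of features, number of training examples) to a regime.

    huge_m is an assumed, illustrative cutoff for "huge" training sets.
    """
    if n_features >= m_examples:
        # Many features, little data: a linear model is enough.
        return "linear"
    if m_examples >= huge_m:
        # Few features, huge data: Gaussian kernel becomes too costly.
        return "gaussian_too_costly"
    # Few features, moderate data: Gaussian kernel works well.
    return "gaussian"

# Hypothetical problem sizes.
print(suggest_regime(10_000, 1_000))
print(suggest_regime(100, 5_000))
print(suggest_regime(100, 1_000_000))
```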

SUMMARY
At the beginning it can feel vague which algorithm to use
SVM is still widely recognized as one of the most popular and powerful learning algorithms
Logistic regression and neural networks are also widely used learning algorithms
These three algorithms (logistic regression, neural networks, SVM) alone in your arsenal are enough to build state-of-the-art machine learning systems