The problem of overfitting
Regularization: a way to reduce overfitting
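High bias (underfit): the hypothesis is too simple for the data, e.g. a straight line fitted to data that is clearly curved
High variance (overfit): too many features (e.g. many high-order polynomial terms) and too little data to constrain them, so the hypothesis fits the training set well but generalizes poorly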
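A minimal sketch of the two cases, assuming NumPy and a made-up quadratic data set (the function `true_fn`, the noise level, and the degrees 1, 2, and 9 are illustrative choices, not from the lecture). It fits polynomials of different degree to 10 noisy points and compares the error on the training points with the error on separate held-out points:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def true_fn(x):
    # Assumed ground truth for illustration: a quadratic
    return 1.0 + 2.0 * x - 3.0 * x**2

# Small training set (10 points) and a separate held-out set
x_train = np.linspace(-1.0, 1.0, 10)
y_train = true_fn(x_train) + rng.normal(0.0, 0.1, size=x_train.shape)
x_test = np.linspace(-0.95, 0.95, 50)
y_test = true_fn(x_test) + rng.normal(0.0, 0.1, size=x_test.shape)

def fit_errors(degree):
    """Fit a polynomial of the given degree; return (train MSE, held-out MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

# degree 1: too simple, degree 2: matches the data, degree 9: one coefficient per point
errors = {d: fit_errors(d) for d in (1, 2, 9)}
for d, (train_mse, test_mse) in errors.items():
    print(f"degree {d}: train MSE = {train_mse:.4f}, held-out MSE = {test_mse:.4f}")
```

The straight line (degree 1) has a large error on both sets, which is the high-bias case. The degree-9 polynomial passes through all 10 training points, so its training error is essentially zero while its held-out error is not; that gap is the high-variance case.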
Example for logistic regression: the same underfit/overfit distinction applies to the decision boundary
There are tools for diagnosing whether a learning algorithm is overfitting or underfitting
With many features, the number of possible high-order polynomial terms grows quickly,
which also makes the hypothesis much harder to visualize (e.g. with over 100 features)
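To get a feel for how fast the polynomial terms pile up, here is a small sketch. The helper name `num_polynomial_terms` is hypothetical; it relies on the standard counting result that the number of monomials of total degree at most d in n variables is C(n + d, d), constant term included:

```python
from math import comb

def num_polynomial_terms(n_features, degree):
    """Count monomials of total degree <= degree in n_features variables,
    including the constant term."""
    return comb(n_features + degree, degree)

for degree in (1, 2, 3):
    print(degree, num_polynomial_terms(100, degree))
```

With 100 features this gives 101 terms at degree 1, 5151 at degree 2, and 176851 at degree 3, so even a modest polynomial order produces far more parameters than can be plotted or inspected by hand.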
First option: reduce the number of features, either by manually choosing which ones to keep or by selecting them automatically (discussed later in greater depth)
The disadvantage is that we may not know in advance which features are useful; dropping a feature throws away information, and it may be that all of the features matter
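The risk of dropping features can be sketched as follows, assuming NumPy and synthetic data in which the first two of three features carry signal and the third is pure noise (the data, the coefficients, and the helper `train_mse` are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n_samples = 200

# Three features: x0 and x1 influence y, x2 is irrelevant (by construction)
X = rng.normal(size=(n_samples, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.1, size=n_samples)

def train_mse(columns):
    """Least-squares fit using only the given feature columns; return training MSE."""
    X_sub = X[:, columns]
    coeffs, *_ = np.linalg.lstsq(X_sub, y, rcond=None)
    return np.mean((X_sub @ coeffs - y) ** 2)

mse_all = train_mse([0, 1, 2])         # keep every feature
mse_without_x2 = train_mse([0, 1])     # drop the irrelevant feature
mse_without_x1 = train_mse([0, 2])     # drop a feature that actually matters

print(f"all features: {mse_all:.4f}")
print(f"without x2:   {mse_without_x2:.4f}")
print(f"without x1:   {mse_without_x1:.4f}")
```

Dropping the irrelevant feature barely changes the error, while dropping a useful one raises it noticeably. Here we know which is which only because the data was constructed that way; with real data that knowledge is usually missing, which is the disadvantage noted above.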