Content-based Recommendation

For the sake of the examples, we choose 2 features, as above (how to choose features will be discussed later).
The value of each feature for a movie varies between 0 and 1.
Each movie has a feature vector, where x0 = 1 (the intercept term).
Theta is the parameter vector of each user, one weight per category (how it is learned is discussed later).
Theta describes how highly the user rates each category. For example, for a movie with x1 = 0.99 and user 1 with theta1 = 5, the predicted rating is 5 x 0.99 = 4.95.
So this is similar to a linear regression problem, and is applied in the same way.
Each feature contributes to the rating according to how highly the user rates that category: with theta = 5, a feature value of 1 in x corresponds to a rating of 5.
So each user gets a different predicted rating. Each theta reflects how that particular user rates the categories/features.
The value calculated with the formula above (5 x 0.99 = 4.95) seems like a reasonable prediction.
theta0 = 0 (the weight on x0, so the intercept does not count),
theta1 = weight on x1 (the rating the user would give to romance),
theta2 = weight on x2 (the rating the user would give to action)
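
To make the prediction step concrete, here is a minimal sketch of the calculation as a dot product of theta and x. The function name predict_rating is hypothetical, and theta2 = 0 and x2 = 0 are assumed values chosen only so the result matches the 5 x 0.99 = 4.95 example.

```python
def predict_rating(theta, x):
    """Predicted rating = sum over k of theta_k * x_k (theta transposed times x)."""
    return sum(t * xk for t, xk in zip(theta, x))


# Assumed example values: x0 = 1 (intercept), x1 = romance, x2 = action.
x = [1.0, 0.99, 0.0]
# Assumed user parameters: theta0 = 0, theta1 = 5 (loves romance), theta2 = 0.
theta = [0.0, 5.0, 0.0]

rating = predict_rating(theta, x)
print(f"{rating:.2f}")  # prints 4.95
```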

Choose theta so that the predictions are as close as possible to the ratings in the training set.
The summation above runs over all the movies that user j has rated.
This is similar to the linear regression problem, except that the m(j) term is dropped.
n = number of features per movie
The formula gives an algorithm that predicts the ratings of movies.
Eliminating m(j) still gives the same value of theta j, since it only scales the cost by a constant.
theta j is an (n + 1)-dimensional vector, because of the additional theta0 (0 in the example).
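
As a sketch of what the cost for a single user could look like in code, assuming the usual regularized squared error without the m(j) term (half the sum of squared errors over the movies the user rated, plus lambda/2 times the sum of squared theta_k for k = 1..n). The names user_cost and lam and the sample data are assumptions for illustration only.

```python
def predict(theta, x):
    """Predicted rating for one movie: theta transposed times x."""
    return sum(t * xk for t, xk in zip(theta, x))


def user_cost(theta, movies, ratings, lam):
    """Regularized squared-error cost for one user.

    movies  -- feature vectors (each starting with x0 = 1) of the rated movies
    ratings -- the ratings this user gave to those movies, in the same order
    lam     -- regularization strength (lambda)
    """
    squared_error = sum((predict(theta, x) - y) ** 2
                        for x, y in zip(movies, ratings))
    # theta0 is conventionally left out of the regularization term.
    regularization = sum(t ** 2 for t in theta[1:])
    return 0.5 * squared_error + 0.5 * lam * regularization


# Assumed sample data: two rated movies with (romance, action) features.
movies = [[1.0, 0.9, 0.0], [1.0, 0.1, 1.0]]
ratings = [5.0, 0.0]
theta = [0.0, 5.0, 0.0]

cost = user_cost(theta, movies, ratings, lam=0.1)
print(cost)
```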

By now, we have the cost function without m(j).
Rather than fitting one user at a time, we also want to learn the parameters for all users, and try to minimize the cost function over all of them.
This gives a different parameter vector, and so different predicted ratings, for each user.
As explained above, we just add a summation over all users to the cost function as well as to the regularization term.
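
The extra summation over users can be sketched as follows. This assumes the ratings are stored as a dictionary keyed by (movie index, user index) that contains only the pairs actually rated; that layout, the name total_cost and the sample numbers are illustrative assumptions.

```python
def predict(theta, x):
    """Predicted rating for one movie: theta transposed times x."""
    return sum(t * xk for t, xk in zip(theta, x))


def total_cost(thetas, X, Y, lam):
    """Cost summed over all users.

    thetas -- list of parameter vectors, one per user j
    X      -- list of movie feature vectors, one per movie i (x0 = 1)
    Y      -- dict mapping (i, j) -> rating, only for rated pairs
    lam    -- regularization strength (lambda)
    """
    squared_error = sum((predict(thetas[j], X[i]) - y) ** 2
                        for (i, j), y in Y.items())
    # Regularize every user's parameters except theta0.
    regularization = sum(t ** 2 for theta in thetas for t in theta[1:])
    return 0.5 * squared_error + 0.5 * lam * regularization


# Assumed sample data: 2 movies, 2 users, 3 known ratings.
X = [[1.0, 0.9, 0.0], [1.0, 0.1, 1.0]]
thetas = [[0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
Y = {(0, 0): 5.0, (1, 0): 0.0, (1, 1): 5.0}

J = total_cost(thetas, X, Y, lam=0.1)
print(J)
```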

For the gradient descent update, the formula differs depending on whether k = 0 (theta0 is not regularized).
The part gathered in blue is the partial derivative of the cost function, which gradient descent uses to minimize it.
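
A minimal sketch of one gradient descent step, assuming the standard update: for k = 0 the gradient is the sum of error times x_k over the movies the user rated, and for k != 0 the term lambda * theta_k is added. The function name gradient_step, the learning rate alpha and the data layout (ratings keyed by (movie index, user index)) are assumptions for illustration.

```python
def predict(theta, x):
    """Predicted rating for one movie: theta transposed times x."""
    return sum(t * xk for t, xk in zip(theta, x))


def gradient_step(thetas, X, Y, lam, alpha):
    """Return updated parameter vectors after one gradient descent step."""
    new_thetas = []
    for j, theta in enumerate(thetas):
        grad = [0.0] * len(theta)
        # Sum over the movies i that user j has rated.
        for (i, user), y in Y.items():
            if user != j:
                continue
            error = predict(theta, X[i]) - y
            for k in range(len(theta)):
                grad[k] += error * X[i][k]
        # Regularization applies only to k != 0.
        for k in range(1, len(theta)):
            grad[k] += lam * theta[k]
        new_thetas.append([t - alpha * g for t, g in zip(theta, grad)])
    return new_thetas


# Assumed sample data: one movie, one user, starting from theta = 0.
X = [[1.0, 1.0, 0.0]]
Y = {(0, 0): 5.0}
thetas = gradient_step([[0.0, 0.0, 0.0]], X, Y, lam=0.0, alpha=0.1)
print(thetas)
```

Calling gradient_step repeatedly with a small enough alpha should drive the cost down; a suitable alpha depends on the data.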

Other advanced optimization algorithms can also be used to minimize the cost function J.