Content-based Recommendation
 In the last video, we addressed the Recommender Systems problem: given a set of movies and a set of users, predict how each user would rate the movies they have not yet seen, so that we can recommend them.
 One approach to this problem is called Content-based Recommendation, which is discussed in this video.
 For the sake of the example, we choose 2 features, as above (how to choose features is discussed later on).
 Each feature value varies between 0 and 1 and measures how strongly the movie belongs to that category.
 Each movie has a feature vector, where x0 = 1 (the intercept term).
 Theta is each user's parameter vector, with one weight per category, which is discussed later.
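 Theta describes how highly a user rates each category. For example, if a movie has x1 = 0.99 and user 1 has a weight of 5 on x1, her predicted rating for that movie is 5 x 0.99 = 4.95.
 So this is similar to a linear regression problem: the predicted rating is the inner product of the user's theta and the movie's feature vector x.
 Theta is on the rating scale: a weight of 5 on a category means a movie with feature value 1 in that category is predicted a rating of 5.
 So each user gets a different predicted rating for the same movie. Each user's theta is learned from how that particular user has rated the categories/features.
 The value calculated by the formula above (5 x 0.99 = 4.95) seems like a reasonable prediction.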
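 theta0 = 0 (the weight on the intercept x0, so x0 does not count here),
 theta1 = weight on x1 (the rating the user would give a fully romance movie),
 theta2 = weight on x2 (the rating the user would give a fully action movie)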
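
 The prediction step can be sketched in a few lines of Python. This is only an illustration: the function name predict_rating is made up, and the values theta2 = 0 and x2 = 0 are assumptions chosen to reproduce the 5 x 0.99 = 4.95 example from these notes.

```python
def predict_rating(theta, x):
    """Predicted rating: inner product of user parameters and movie features."""
    return sum(t * f for t, f in zip(theta, x))

# Assumed user parameters: theta0 = 0, theta1 = 5 (romance), theta2 = 0 (action).
theta_user1 = [0.0, 5.0, 0.0]
# Assumed movie features: x0 = 1 (intercept), x1 = 0.99 (romance), x2 = 0 (action).
x_movie = [1.0, 0.99, 0.0]

rating = predict_rating(theta_user1, x_movie)
print(round(rating, 2))  # 4.95
```

 A user with a different theta would get a different prediction for the same movie, which is why each user has their own parameter vector.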
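 To learn theta(j), choose the theta that makes the predictions as close as possible to the ratings in the training set.
 The summation above runs over all the movies that user j has rated.
 This is similar to the linear regression problem, except that the m(j) term is dropped, where m(j) is the number of movies rated by user j.
 n = number of features per movie
 Minimizing this formula gives an algorithm that predicts the ratings of movies.
 Eliminating m(j) still gives the same value of theta(j), because it is only a constant that scales the whole cost.
 theta(j) is an (n + 1)-dimensional vector, because it also includes the additional parameter theta0 (theta0 = 0 in the example).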
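
 As a rough sketch of the per-user cost described above, assuming the usual linear regression conventions (a factor of 1/2 on both terms, a regularization strength called lam, and no penalty on theta0), the computation might look like this. The function name and the data are hypothetical.

```python
def user_cost(theta, movies, ratings, lam):
    """Regularized squared-error cost for one user, without the 1/m(j) factor.

    movies  : feature vectors (each starting with x0 = 1) of the movies
              this user has rated
    ratings : the user's ratings for those movies, in the same order
    lam     : regularization strength (lambda), an assumed name
    """
    squared_error = 0.0
    for x, y in zip(movies, ratings):
        prediction = sum(t * f for t, f in zip(theta, x))
        squared_error += (prediction - y) ** 2
    # Assumption: theta[0] (the intercept weight) is not regularized.
    penalty = sum(t ** 2 for t in theta[1:])
    return 0.5 * squared_error + 0.5 * lam * penalty

# Hypothetical data: two rated movies with features (x0, romance, action).
movies = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
ratings = [5.0, 0.0]
theta = [0.0, 5.0, 0.0]

cost_no_reg = user_cost(theta, movies, ratings, lam=0.0)
cost_reg = user_cost(theta, movies, ratings, lam=1.0)
print(cost_no_reg, cost_reg)  # 0.0 12.5
```

 With lam = 0 the cost is zero because this theta fits both ratings exactly; with lam = 1 the only contribution is the penalty 0.5 x 5^2 = 12.5.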
 By now, we have the cost function for a single user, without m(j).
 Instead of learning the parameters for one user at a time, we want to learn the parameters for all users together, by minimizing one cost function.
 Each user still has their own theta, and therefore their own predicted ratings.
 In other words, we just add a summation over all users to the cost function, in both the squared-error term and the regularization term.
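
 The summation over all users can be sketched as follows. The layout of the ratings (a dictionary keyed by movie index and user index) and the names total_cost and lam are assumptions made for this illustration, not something given in the video.

```python
def total_cost(thetas, features, ratings, lam):
    """Cost summed over all users.

    thetas   : thetas[j] is the parameter vector of user j
    features : features[i] is the feature vector of movie i (x0 = 1 first)
    ratings  : dict mapping (i, j) to the rating user j gave movie i;
               pairs that were never rated are simply absent
    lam      : regularization strength (lambda), an assumed name
    """
    error = 0.0
    for (i, j), y in ratings.items():
        prediction = sum(t * f for t, f in zip(thetas[j], features[i]))
        error += (prediction - y) ** 2
    # Regularize every user's parameters except theta0.
    penalty = sum(t ** 2 for theta in thetas for t in theta[1:])
    return 0.5 * error + 0.5 * lam * penalty

# Hypothetical data: 2 movies (x0, romance, action) and 2 users.
features = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
thetas = [[0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
ratings = {(0, 0): 5.0, (1, 0): 0.0, (0, 1): 0.0, (1, 1): 4.0}

total = total_cost(thetas, features, ratings, lam=0.0)
print(total)  # 0.5
```

 Only the second user's rating of the second movie is off (predicted 5, rated 4), so the unregularized cost is 0.5 x 1^2 = 0.5.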

For the Gradient Descent update, the formula differs depending on whether k = 0: the update for theta0 has no regularization term. The term highlighted in blue is the partial derivative of the cost function, which Gradient Descent uses to minimize it.
 Other advanced optimization algorithms can also be used to minimize the cost function J.
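
 A minimal sketch of the update for a single user, assuming a learning rate alpha and regularization strength lam (both names and all data are hypothetical). The k = 0 case skips the regularization term, as described above.

```python
def gradient_step(theta, movies, ratings, lam, alpha):
    """One gradient descent update of a single user's theta."""
    new_theta = []
    for k in range(len(theta)):
        grad = 0.0
        for x, y in zip(movies, ratings):
            prediction = sum(t * f for t, f in zip(theta, x))
            grad += (prediction - y) * x[k]
        if k != 0:
            # The regularization term applies only when k is not 0.
            grad += lam * theta[k]
        # Every component is computed from the old theta (simultaneous update).
        new_theta.append(theta[k] - alpha * grad)
    return new_theta

# Hypothetical data: two rated movies with features (x0, romance, action).
movies = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
ratings = [5.0, 0.0]

theta = [0.0, 0.0, 0.0]
for _ in range(200):
    theta = gradient_step(theta, movies, ratings, lam=0.0, alpha=0.1)

predictions = [sum(t * f for t, f in zip(theta, x)) for x in movies]
print([round(p, 3) for p in predictions])
```

 With no regularization, the predictions for the two rated movies should approach the actual ratings of 5 and 0 after enough iterations.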
SUMMARY
 CBR is one approach to the Recommender Systems problem: it uses features that capture the content of the various movies to predict ratings.
 Sometimes this is difficult, for example when a movie does not have clear features/categories.
 Next: how a Recommender System handles products without such features, using an approach that works for all products regardless of content.