**Recommender Systems**

## An overview of the math behind Recommender Systems

Have you ever wondered how Netflix recommends movies to you? Has it ever occurred to you how often those recommended movies are the ones you end up binge-watching and thoroughly enjoying?

All this is nothing but the magic of Recommender Systems, which use Machine Learning to come up with remarkably accurate recommendations for you. The idea is that a Recommender System goes over all of the movies you’ve watched and rated in the past, learns what you like, and recommends what you might want to watch next.

Boggled? Let me break this down for you with an outline of the two famous types of Recommender Systems: Collaborative Filtering and Content-Based Filtering.

Let us first look at the collaborative filtering approach:

**Collaborative Filtering System:**

Collaborative Filtering Systems predict what you will like based on what other, similar users have liked in the past.

In this approach, the algorithm proceeds step by step:

1. Consider user X

2. Find set N (Neighborhood of user X) of other users whose ratings are “most similar” to X’s ratings

3. Estimate X’s ratings based on ratings of users in N

**Measuring the *“most similar”***

Let us consider a set of four users (U1, U2, U3, U4) as the rows of a table and a set of movies (M1, M2… M7) as the columns. Imagine a rating scale from 0 to 5. Some cells hold the ratings given by the users for the corresponding movies, while other cells are empty, i.e. have missing values.

If we look at the above table carefully, we will notice that U1 and U2 have rated only one movie (M1) in common, but the ratings are fairly high. This implies that users U1 and U2 have similar tastes or interests. Although users U1 and U3 have rated two movies in common, those ratings are fairly low. Thus, intuitively, we can conclude that U1 and U3 are dissimilar while U1 and U2 are similar.

**SIM (U1, U2) > SIM (U1, U3)**

To calculate the similarity between the two users, U1 and U2, we use the Cosine Similarity:

**SIM (U1, U2) = cosine (RU1, RU2),**

where RU1 and RU2 are the rating vectors of U1 and U2 respectively.

But plain cosine similarity has a problem: it treats the empty cells as zeros, which amounts to treating unrated movies as very negatively rated. So, instead of a plain cosine similarity, we normalize the ratings by subtracting each user’s row mean and use what is known as the ‘Centered Cosine Similarity’.

Below is the table we get after normalizing the ratings:

If you observe, you will notice that the ratings for any particular user now sum to zero. This is because the ratings have been centered around zero. Ratings above zero indicate a positive or high rating and those below zero indicate a negative or low rating. Now, calculating the ‘Centered Cosine Similarity’ between users U1, U2 and U1, U3, we get:

**SIM (U1, U2) = cosine (RU1, RU2) = 0.09,** and

**SIM (U1, U3) = cosine (RU1, RU3) = -0.56**

**SIM (U1, U2) > SIM (U1, U3)**

The results indicate that U1 and U2 are somewhat similar, whereas U1 and U3 are quite unlike each other.
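As a sketch, here is how the centered cosine computation might look in Python. The rating vectors below are hypothetical (the article’s table is not reproduced here), with `None` marking a missing rating, and they are chosen so that they reproduce the two similarities quoted above:

```python
import math

def centered_cosine(u, v):
    # Centre each vector by subtracting its mean over *rated* entries
    # (None marks a missing rating, and becomes 0 after centering),
    # then take the ordinary cosine of the centred vectors.
    def centre(r):
        rated = [x for x in r if x is not None]
        mean = sum(rated) / len(rated)
        return [(x - mean) if x is not None else 0.0 for x in r]

    cu, cv = centre(u), centre(v)
    dot = sum(a * b for a, b in zip(cu, cv))
    norm_u = math.sqrt(sum(a * a for a in cu))
    norm_v = math.sqrt(sum(b * b for b in cv))
    return dot / (norm_u * norm_v)

# Hypothetical ratings for U1, U2, U3 over seven movies:
r_u1 = [4, None, None, 5, 1, None, None]
r_u2 = [5, 5, 4, None, None, None, None]
r_u3 = [None, None, None, 2, 4, 5, None]

print(round(centered_cosine(r_u1, r_u2), 2))  # → 0.09
print(round(centered_cosine(r_u1, r_u3), 2))  # → -0.56
```

Notice that U1 and U2 share only one rated movie, yet centering still yields a small positive similarity, while U1 and U3 come out clearly negative.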

Now that we have estimated and grouped similar and dissimilar users, the next step is to make rating predictions for a user.

**Rating Predictions:**

Suppose a user *x* has a rating vector *Rx*, and we are required to predict the rating this user would give to item *i*. Using Centered Cosine Similarity, we find a neighborhood *N* of the *k* users most similar to *x* who have rated item *i*. We then take a weighted average: for each user *y* in neighborhood *N*, we weight *y*’s rating for item *i* by the similarity of *x* and *y*, sum these products, and normalize by dividing by the sum of the similarities between *x* and each *y*. The result gives us an estimated rating of item *i* by user *x*.

**R(x, i) = Σ(y∈N) Sxy · Ryi / Σ(y∈N) Sxy**

*where Sxy = SIM (x, y) and Ryi is user y’s rating for item i*
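A minimal sketch of this weighted average in Python, with made-up similarity weights and neighbour ratings:

```python
def predict_rating(similarities, ratings):
    """Weighted-average prediction: the sum of s_xy * r_yi over the
    neighborhood N, normalized by the sum of the similarities s_xy."""
    numerator = sum(s * r for s, r in zip(similarities, ratings))
    return numerator / sum(similarities)

# Two hypothetical neighbours of user x who rated item i:
#   similarities s_xy = 0.8 and 0.5, ratings r_yi = 4 and 2
print(round(predict_rating([0.8, 0.5], [4, 2]), 2))  # → 3.23
```

The more similar a neighbour is to *x*, the more its rating pulls the prediction toward itself.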

The technique that we used above is sometimes called **User-User Collaborative Filtering**, as we find other users with similar interests in order to make predictions for a given user. A dual approach is **Item-Item Collaborative Filtering**, where instead of starting with a user, we start with an item *i* and find a set of other items similar to *i*. Then we estimate the rating for item *i* based on the ratings for those similar items. We can use the same similarity metrics and prediction approaches as in the User-User model.

Let us consider a group of users U1, U2… U12 as the columns of the table and a set of movies M1, M2… M6 as the rows of the table.

The yellow cells hold known ratings (on a scale of 1 to 5) and the white ones are empty. Our goal here is to predict the rating for movie 1 (M1) by user 5 (U5). The first step is to find other movies similar to M1. To calculate similarity, we use the *Pearson Correlation*, which is equivalent to the *Centered Cosine* technique we used earlier. So, we take all the movies, calculate their individual centered cosine similarities with respect to M1, and list them.

If we take into account only the movies rated by U5 that have a positive centered cosine similarity with M1, we have two neighbors in the neighborhood *N*, i.e. |N| = 2. Using the same method as in User-User Collaborative Filtering, we calculate the similarity between M1 and M3, and between M1 and M6. We get:

**SIM (M1, M3) = 0.41**, and **SIM (M1, M6) = 0.59**

Taking the weighted average (using the formula mentioned earlier), we find that the predicted rating of movie M1 by user U5 is **2.6**.
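The whole item-item pipeline can be sketched in Python. The movies × users matrix below is hypothetical (the article’s table is an image and is not reproduced here), but it is chosen so that the final prediction matches the 2.6 quoted above:

```python
import math

def centre(row):
    # Subtract the mean over rated entries; missing ratings (None) become 0.
    rated = [x for x in row if x is not None]
    mean = sum(rated) / len(rated)
    return [(x - mean) if x is not None else 0.0 for x in row]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Hypothetical 6-movies x 12-users ratings matrix (None = unrated):
R = [
    [1, None, 3, None, None, 5, None, None, 5, None, 4, None],  # M1
    [None, None, 5, 4, None, None, 4, None, None, 2, 1, 3],     # M2
    [2, 4, None, 1, 2, None, 3, None, 4, 3, 5, None],           # M3
    [None, 2, 4, None, 5, None, None, 4, None, None, 2, None],  # M4
    [None, None, 4, 3, 4, 2, None, None, None, None, 2, 5],     # M5
    [1, None, 3, None, 3, None, None, 2, None, None, 4, None],  # M6
]

target, user = 0, 4  # predict M1's rating by U5 (0-indexed)
centred = [centre(row) for row in R]

# Similarity of every other movie to M1, keeping only movies U5 rated.
neighbours = [
    (cosine(centred[target], centred[j]), R[j][user])
    for j in range(len(R))
    if j != target and R[j][user] is not None
]
# Keep only positively similar movies, as in the text.
neighbours = [(s, r) for s, r in neighbours if s > 0]

# Weighted-average prediction over the neighborhood.
prediction = (sum(s * r for s, r in neighbours) /
              sum(s for s, _ in neighbours))
print(round(prediction, 1))  # → 2.6
```

With this matrix, only two movies survive the positive-similarity filter, and the weighted average of U5’s ratings for them gives the prediction.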

Now that we have learnt Collaborative Filtering, let us dive into the Content Based Recommender System.

**Content-Based Recommender System:**

Content-based Recommender Systems predict what you will like based on what you have liked in the past.

The diagram above shows a basic plan of action for the content-based recommender system, explained in the points below:

· Find a *set of items* liked by the user, through both explicit and implicit methods. For example, the items purchased by the user

· Using that set of items, an *Item Profile* is built, which is essentially a description of the items purchased by the user. In the diagram, geometric shapes are used for the sake of succinctness. So, we can conclude that the user likes items that are red in color and shaped like circles and triangles

· Next, from the item profile, we infer a *User Profile*, which contains information about the user’s likes and purchases

Now that we have the user and the item profiles, the next task would be to recommend certain items to the user.

· Given the user, we compute the similarity between that user’s profile and all the items available in the catalog. The similarity is calculated using the *Cosine Similarity* technique.

· We then pick the items with the highest cosine similarity and recommend those to the user.
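The steps above can be sketched in Python. The binary item profiles and feature names below are invented for illustration, echoing the red circles and triangles from the diagram:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical item profiles: binary features
# [red, blue, circle, triangle, square]
catalog = {
    "item_a": [1, 0, 1, 0, 0],   # red circle
    "item_b": [1, 0, 0, 1, 0],   # red triangle
    "item_c": [0, 1, 0, 0, 1],   # blue square
    "item_d": [1, 0, 0, 0, 1],   # red square
}
liked = ["item_a", "item_b"]     # items the user purchased / liked

# User profile = average of the profiles of the liked items.
n_features = len(next(iter(catalog.values())))
user_profile = [sum(catalog[i][k] for i in liked) / len(liked)
                for k in range(n_features)]

# Recommend the unseen item most similar to the user profile.
candidates = {i: v for i, v in catalog.items() if i not in liked}
best = max(candidates, key=lambda i: cosine(user_profile, catalog[i]))
print(best)  # → item_d
```

Here the user profile is strongly “red”, so the red square beats the blue one even though the user has never bought a square before.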

**Conclusion**

So, there we have the mathematics behind our two important recommender system approaches. In my next articles, I will show you how to implement and use each of these recommender systems using Python libraries.

Should you encounter any problems while reading this article or have any suggestions for me, feel free to leave a comment below or shoot me an email: nafeea3000@gmail.com.