Lecture 5: Recommender systems
-understand what a recommender system is
- A recommender system is a subclass of information filtering system that seeks to predict the "rating" or "preference" a user would give to an item
- A recommender system is a system capable of predicting a user's future preferences over a set of items and recommending the top items.
-understand why missing data is an important issue for recommender systems
-understand what collaborative filtering is and the difference between user-based and item-based collaborative filtering
- Collaborative Filtering: Make predictions about a user’s missing data according to the behaviour of many other users
- Look at users’ collective behaviour
- Look at the active user’s history
- Combine!
- Approach
- User based methods
- Identify like-minded users
- Item based methods
- Model (matrix) based methods (not examinable)
- Simultaneously identify like-minded users and items
- User based
- Achieve good quality in practice
- The more processing we push offline, the better the method scales
- However:
- User preference is dynamic
- Offline-calculated information therefore needs to be updated frequently
- No recommendation for new users
- We don’t know much about them yet
- Item based
- Search for similarities among items
- All computations can be done offline
- Item-Item similarity is more stable than user-user similarity
- No need for frequent updates
- Same as in user-user similarity but on item vectors
- Find similar items to the one whose rating is missing
- E.g. for item i_i, compute its similarity to every other item i_j
- Offline phase: for each item
- Determine its k most similar items
- Can use same type of similarity as for user-based
- Online phase:
- Predict the rating r_aj for a given user-item pair (user a, item j) as a weighted sum over the user's ratings of the k most similar items that the user has rated
-understand the difference between i) user-based methods for collaborative filtering and ii) item-based methods for collaborative filtering
- Same as in user-user similarity but on item vectors
- Find similar items to the one whose rating is missing
- E.g. for item i_i, compute its similarity to every other item i_j
-understand how to measure user-user similarity via transformation of the Euclidean distance
- Identify like-minded users
- Measure similarity
- Method 1:
- Fill in User1’s missing values with the mean (average) of User1’s known values (18.1 in the example)
- Fill in User2’s missing values with the mean of User2’s known values (14.1 in the example)
- Compute the Euclidean distance (or its square) between the resulting vectors (rows)
- Convert the distance into a similarity (high similarity for low distance, low similarity for high distance)
- Method 2:
- Compute squared Euclidean distance between vectors, summing only pairs without missing values
- Scale the result according to the percentage of pairs with a missing value, so that users with few complete pairs are not judged artificially close
- Other:
- Correlation
- Cosine similarity
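The two distance methods above can be sketched in Python as follows. This is a minimal illustration, not the lecture's reference implementation. Missing ratings are represented as None. The function names, the scaling factor n / (number of complete pairs) used for Method 2, and the conversion 1 / (1 + distance) are assumptions chosen for illustration; the notes only say to scale by the percentage of missing pairs and to map low distance to high similarity.

```python
import math

def mean_fill(ratings):
    """Method 1 helper: replace each missing rating (None) with the mean of
    the user's known ratings. Assumes at least one rating is known."""
    known = [r for r in ratings if r is not None]
    mean = sum(known) / len(known)
    return [mean if r is None else r for r in ratings]

def distance_method1(u, v):
    """Method 1: Euclidean distance after mean-filling each user's gaps."""
    fu, fv = mean_fill(u), mean_fill(v)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fu, fv)))

def distance_method2(u, v):
    """Method 2: squared Euclidean distance over pairs with no missing value,
    scaled up by n / (number of complete pairs). The exact scaling factor is
    an assumption."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        return math.inf  # nothing in common: treat as maximally distant
    partial = sum((a - b) ** 2 for a, b in pairs)
    return partial * len(u) / len(pairs)

def to_similarity(distance):
    """One possible conversion (assumed): distance 0 gives similarity 1,
    and similarity falls towards 0 as distance grows."""
    return 1.0 / (1.0 + distance)

# Hypothetical ratings for two users over four items.
user1 = [5, None, 3, 1]
user2 = [4, 2, None, 1]

d1 = distance_method1(user1, user2)
d2 = distance_method2(user1, user2)
print(d1, to_similarity(d1))
print(d2, to_similarity(d2))
```

Method 1 uses every position but treats an imputed mean as if it were a real rating. Method 2 only trusts pairs where both ratings exist and then compensates for the positions it skipped.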
-when performing user-user similarity, understand how to select neighbours and make a prediction for the missing item
- Select neighbours & make prediction
- At runtime
- Need to select users to compare to
- Could choose the top-k most similar users
- Combining: the predicted rating is the average of the ratings given by the top-k most similar users
- Can be made more efficient by computing clusters of users offline
- At runtime find nearest cluster & use the centre of the cluster as the rating prediction
- Faster but less accurate
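A small sketch of the top-k step described above, under some assumptions: ratings are stored as a dict of lists with None for missing values, similarity is 1 / (1 + Euclidean distance) over co-rated items, and only users who have actually rated the target item are eligible as neighbours. The names predict_user_based and similarity and the data are hypothetical.

```python
import math

def similarity(u, v):
    """Similarity from Euclidean distance over co-rated items.
    The 1 / (1 + d) conversion is an assumption."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        return 0.0
    d = math.sqrt(sum((a - b) ** 2 for a, b in pairs))
    return 1.0 / (1.0 + d)

def predict_user_based(ratings, active, item, k=2):
    """Average the ratings for `item` given by the k users most similar
    to `active`. Users who have not rated `item` are skipped (assumption)."""
    candidates = [
        (similarity(ratings[active], row), row[item])
        for user, row in ratings.items()
        if user != active and row[item] is not None
    ]
    if not candidates:
        return None  # e.g. nobody else has rated the item yet
    top_k = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return sum(r for _, r in top_k) / len(top_k)

# Hypothetical data: alice has not rated item 2.
ratings = {
    "alice": [5, 3, None, 1],
    "bob":   [5, 3, 4, 1],
    "carol": [4, 3, 5, 2],
    "dave":  [1, 5, 1, 5],
}

print(predict_user_based(ratings, "alice", item=2, k=2))  # 4.5 (mean of bob's 4 and carol's 5)
```

Here bob and carol are the two nearest neighbours of alice, so dave's very different rating is ignored. The similarities are computed at request time, which is why the notes suggest pushing work offline, for example by clustering users in advance.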
-understand how to predict missing ratings for an item using item-item similarity.
- Search for similarities among items
- All computations can be done offline
- Item-Item similarity is more stable than user-user similarity
- No need for frequent updates
- Find similar items to the one whose rating is missing
- Same as in user-user similarity but on item vectors
- E.g. for item i_i, compute its similarity to every other item i_j
- Offline phase: for each item
- Determine its k most similar items
- Can use same type of similarity as for user-based
- Online phase:
- Predict the rating r_aj for a given user-item pair (user a, item j) as a weighted sum over the user's ratings of the k most similar items that the user has rated
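The offline and online phases can be sketched as two separate functions. This is an illustrative sketch with assumptions: the rating matrix is a list of user rows with None for missing values, item similarity is 1 / (1 + Euclidean distance) over users who rated both items, and the weighted sum is normalised by the sum of the similarities (a weighted average). The names build_neighbours and predict_item_based are hypothetical.

```python
import math

def similarity(x, y):
    """Similarity between two item vectors (columns), computed over users
    who rated both items. The 1 / (1 + d) conversion is an assumption."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        return 0.0
    d = math.sqrt(sum((a - b) ** 2 for a, b in pairs))
    return 1.0 / (1.0 + d)

def build_neighbours(ratings, k):
    """Offline phase: for each item, store its k most similar items
    as (similarity, item index) pairs."""
    n_items = len(ratings[0])
    cols = [[row[j] for row in ratings] for j in range(n_items)]
    neighbours = {}
    for j in range(n_items):
        sims = [(similarity(cols[j], cols[i]), i) for i in range(n_items) if i != j]
        sims.sort(key=lambda s: s[0], reverse=True)
        neighbours[j] = sims[:k]
    return neighbours

def predict_item_based(ratings, neighbours, a, j):
    """Online phase: predict r_aj as a similarity-weighted average of user a's
    ratings on the stored neighbours of item j that user a has rated."""
    num = den = 0.0
    for sim, i in neighbours[j]:
        r = ratings[a][i]
        if r is not None:
            num += sim * r
            den += sim
    return num / den if den > 0 else None

# Hypothetical matrix: rows are users, columns are items; user 3 has not rated item 1.
ratings = [
    [5, 5, 1, 1],
    [4, 4, 2, 2],
    [1, 1, 5, 5],
    [5, None, 1, 2],
]

neighbours = build_neighbours(ratings, k=2)          # done once, offline
print(predict_item_based(ratings, neighbours, 3, 1))  # approximately 4.5
```

Only build_neighbours touches the whole matrix. The online call reads a short precomputed list per item, which is what makes the item-based approach cheap at request time.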
-appreciate the difference between the online and offline phases for item based collaborative filtering
- Offline phase: for each item
- Determine its k most similar items
- Can use same type of similarity as for user-based
- Online phase:
- Predict the rating r_aj for a given user-item pair (user a, item j) as a weighted sum over the user's ratings of the k most similar items that the user has rated
-the material on matrix factorisation does not need to be known