Instance Selection Techniques for Memory-Based Collaborative Filtering

conference paper
Published by Society for Industrial & Applied Mathematics (SIAM)

https://doi.org/10.1137/1.9781611972726.4

Abstract

Collaborative filtering (CF) has become an important data mining technique for making personalized recommendations for items such as books, web pages, and movies. One popular algorithm is memory-based collaborative filtering, which predicts a user's preference based on his or her similarity to other users (instances) in the database. However, with the tremendous growth in the number of users and products, memory-based CF algorithms face the problem of deciding which instances to use during prediction, in order to reduce execution cost and excessive storage, and possibly to improve generalization accuracy by avoiding noise and overfitting. In this paper, we focus on a typical user preference database that contains many missing values, and propose four novel instance reduction techniques, called TURF1-TURF4, as a preprocessing step to improve the efficiency and accuracy of the memory-based CF algorithm. The key idea is to generate predictions from a carefully selected set of relevant instances. We evaluate the techniques on the well-known EachMovie data set. Our experiments show that the proposed algorithms not only dramatically speed up prediction but also improve accuracy.
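To make the setting concrete, the following is a minimal sketch of generic memory-based CF prediction over a chosen set of instances. It is not the paper's TURF1-TURF4 procedure, whose details the abstract does not give. The Pearson-weighted, mean-centered prediction rule, the dictionary representation of users (item to rating, with missing values simply absent), and the names `pearson` and `predict` are all assumptions made for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation over items both users rated; 0.0 if undefined."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(
        sum((a[i] - mean_a) ** 2 for i in common)
        * sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(active, item, instances):
    """Predict the active user's rating for `item` from the given instances.

    `instances` is whatever subset of the user database was selected;
    how to select it is the question the paper addresses and is not
    modeled here.
    """
    mean_active = sum(active.values()) / len(active)
    num = den = 0.0
    for other in instances:
        if item not in other:
            continue  # missing value: this instance cannot contribute
        w = pearson(active, other)
        mean_other = sum(other.values()) / len(other)
        num += w * (other[item] - mean_other)
        den += abs(w)
    # Fall back to the active user's mean when no instance is informative.
    return mean_active + num / den if den else mean_active


# Hypothetical toy data: each user maps item -> rating.
active = {"A": 5, "B": 3, "C": 4}
u1 = {"A": 5, "B": 3, "C": 4, "D": 5}
u2 = {"A": 1, "B": 5, "D": 2}
estimate = predict(active, "D", [u1, u2])
```

The cost of `predict` grows linearly with the number of instances passed in, which is why reducing the instance set as a preprocessing step can speed up prediction.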
