A recommendation system is a software program that attempts to narrow down selections for users based on their expressed preferences, past behavior, or other data that can be mined about the user or about other users with similar interests.
Recommendation systems have their roots in "Usenet," a worldwide distributed discussion system that originated at Duke University in the late 1970s. Usenet operated on a client/server model in which the posts made by users were categorized into specific "newsgroups," which were further divided into sub-categories where needed.
Information Filtering (IF) is a way of sifting through the overabundance of data on the Web. As newsgroups grew exponentially, database administrators were scrambling for a way to reduce e-clutter. Some of the early solutions for data overload include:
Pattie Maes was an early driving force behind collaborative filtering through her work at MIT on Firefly, a recommendation system for music lovers. Firefly was later purchased by Microsoft for an estimated 40 million dollars.
Through the 1990s and beyond, collaborative filtering recommendation systems included:
Recommendation systems are now an integral part of how Amazon.com drives purchases.
The current generation of recommendation methods can be broadly classified into the following five categories, based on the knowledge sources they use to make recommendations:
General Requirements for Recommendation Systems
To make a viable recommendation, three things are needed:
Content-based systems recommend items to the user that are similar to the ones he or she preferred in the past. For example, to recommend books to user u, a content-based book recommender looks for similarities among the books that u has rated highly in the past (specific writers, genres, subject matter). Only books with a high degree of similarity to the user’s preferences are recommended.
Content-based systems are designed mostly to recommend text-based items. The features that are evaluated against the user’s preferences are called “keywords.”
Content-based recommendations can either be:
How Content-based Recommendation Works
The user profile of preferences is stored as a vector of keywords. These profiles are obtained by analyzing the content of the items previously seen and rated by the user and are usually constructed using keyword analysis techniques from information retrieval.
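The keyword-vector profile described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the rating-weighted term counts, and the sample book descriptions are all assumptions made for the example, and real systems typically use more refined weighting than raw counts.

```python
from collections import Counter
from math import sqrt


def keyword_vector(text):
    """Turn an item description into a sparse keyword vector (term counts)."""
    return Counter(text.lower().split())


def build_profile(rated_items):
    """Sum the keyword vectors of rated items, weighted by the user's rating."""
    profile = Counter()
    for text, rating in rated_items:
        for keyword, count in keyword_vector(text).items():
            profile[keyword] += rating * count
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse keyword vectors."""
    dot = sum(weight * b.get(keyword, 0) for keyword, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical data: (item description, rating on a 1-5 scale).
rated = [
    ("space opera adventure", 5),
    ("detective mystery noir", 1),
]
profile = build_profile(rated)

# Hypothetical unseen items, ranked by similarity to the profile.
candidates = {
    "book_a": "space adventure saga",
    "book_b": "noir detective thriller",
}
ranked = sorted(
    candidates,
    key=lambda name: cosine(profile, keyword_vector(candidates[name])),
    reverse=True,
)
print(ranked)
```

Because the highly rated item contributes larger keyword weights, the candidate that shares its keywords ranks first.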
The keywords in a profile are assigned weights using algorithms such as Winnow and Rocchio.
Model-based implementations of content-based recommendation systems also use Bayesian classifiers and various machine learning techniques, including clustering, decision trees, and artificial neural networks.
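As an illustration of keyword weighting, the following sketch applies one round of Rocchio-style feedback to a profile: the weights move toward the average of the liked items and away from the average of the disliked ones. The function name, the dictionary representation, and the default coefficients (alpha, beta, gamma) are assumptions chosen for the example; the coefficients are commonly cited starting values, not values taken from this article.

```python
def rocchio_update(profile, liked, disliked, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a new keyword-weight profile after one round of Rocchio feedback.

    profile and every vector in liked/disliked map keyword -> weight.
    """
    keywords = set(profile)
    for vector in liked + disliked:
        keywords.update(vector)

    updated = {}
    for keyword in keywords:
        # Average weight of this keyword among liked and disliked items.
        pos = sum(v.get(keyword, 0.0) for v in liked) / len(liked) if liked else 0.0
        neg = sum(v.get(keyword, 0.0) for v in disliked) / len(disliked) if disliked else 0.0
        weight = alpha * profile.get(keyword, 0.0) + beta * pos - gamma * neg
        # Negative weights are clipped to zero in this sketch.
        updated[keyword] = max(weight, 0.0)
    return updated


# Hypothetical feedback: one liked item and one disliked item.
old_profile = {"jazz": 1.0}
new_profile = rocchio_update(
    old_profile,
    liked=[{"jazz": 1.0, "piano": 1.0}],
    disliked=[{"metal": 1.0}],
)
print(new_profile)
```

Keywords from liked items gain weight, while a keyword that appears only in disliked items ends up at zero.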
Limitations of Content-based Recommendation Systems
The “ramp-up” problem, also known as the cold start problem, is a well-known issue with content-based systems and with recommendation systems in general. New items cannot be recommended to any user until they have received some ratings, so recommendations for items that are new to the database are weaker than those for more widely rated products. The same holds for users who are new to the system. In other words, until there is a large number of users whose habits are known, the system cannot be useful for most users, and until a sufficient number of rated items has been collected, the system cannot be useful for a particular user.
Once a user’s profile has been established in the system, it is hard for the system to adapt to changes in the user’s preferences. For example, an alcoholic who becomes a teetotaler will continue to get alcohol recommendations from a content-based or collaborative recommender for some time, until newer ratings outweigh the older ones.
Collaborative recommendation systems suggest items that people with similar taste preferred in the past. See also: Collaborative filtering
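A minimal sketch of this idea is user-based collaborative filtering: find users whose ratings resemble the target user's, then score unseen items by a similarity-weighted average of those users' ratings. The user names, items, and ratings below are hypothetical, and cosine similarity is only one of several similarity measures that could be used here.

```python
from math import sqrt


def similarity(a, b):
    """Cosine similarity between two users' rating dicts (missing = 0)."""
    dot = sum(rating * b[item] for item, rating in a.items() if item in b)
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """Score items the user has not rated by similar users' ratings."""
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only recommend unseen items
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    scores = {item: totals[item] / weights[item] for item in totals}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"A": 5, "B": 4},
    "bob": {"A": 5, "B": 5, "C": 5, "D": 1},
    "carol": {"A": 1, "B": 1, "C": 1, "D": 5},
}
recommendations = recommend("alice", ratings)
print(recommendations)
```

Here alice's ratings resemble bob's far more than carol's, so the item bob rated highly ("C") is ranked above the one carol rated highly ("D").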
Knowledge-based recommendation uses knowledge about users and products to reason about which products meet the user's requirements. Some of the systems in use at present effectively walk the user down a discrimination tree of product attributes, whereas others have adopted a quantitative decision support tool for this task.
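Walking the user down a tree of product attributes can be approximated as a sequence of filters, where each answer narrows the set of candidates. The catalog, attribute names, and thresholds below are invented for illustration; this sketch does not model the quantitative decision support approach.

```python
# Hypothetical product catalog with explicit attributes.
catalog = [
    {"name": "camera_a", "type": "compact", "price": 200, "zoom": 5},
    {"name": "camera_b", "type": "dslr", "price": 900, "zoom": 3},
    {"name": "camera_c", "type": "compact", "price": 450, "zoom": 10},
]


def narrow(products, attribute, accept):
    """Keep only the products whose attribute value passes the accept test."""
    return [p for p in products if accept(p[attribute])]


# Each step corresponds to one question answered by the user.
steps = [
    ("type", lambda value: value == "compact"),
    ("price", lambda value: value <= 300),
]

candidates = catalog
for attribute, accept in steps:
    candidates = narrow(candidates, attribute, accept)

names = [p["name"] for p in candidates]
print(names)
```

No ratings are involved at any point: the recommendation follows entirely from the stated requirements and the product attributes.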
Advantages of Knowledge-based Recommendation
It doesn't have the “ramp-up” problem since its recommendations don’t depend on any database of user ratings. Users are encouraged to explore and understand the information space and, by doing so, they elaborate more on their needs.
Disadvantages of Knowledge-based Recommendation
It requires an engineered knowledge database to make useful recommendations. This knowledge base has to be updated to keep up with the ever-changing consumer ratings and preferences. This system tends to give static suggestions that limit the user to what is contained in the database.
Demographic recommendation categorizes the user based on personal attributes and makes recommendations based on demographic classes, e.g., college students, teenagers, women, or men. The advantages and disadvantages of this approach are similar to those of knowledge-based recommendation systems.
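A rough sketch of recommending by demographic class: assign each user to a class from personal attributes, then suggest the items most popular within that class. The classification rules, user records, and purchase data are hypothetical and deliberately simplistic.

```python
from collections import Counter, defaultdict


def demographic_class(user):
    """Assign a user to a coarse demographic class (hypothetical rules)."""
    if user["age"] < 20:
        return "teenager"
    if user.get("student"):
        return "college student"
    return "general"


def popularity_by_class(users, purchases):
    """Count how often each item was bought within each demographic class."""
    counts = defaultdict(Counter)
    for user in users:
        counts[demographic_class(user)].update(purchases.get(user["id"], []))
    return counts


def recommend_for(user, counts, top_n=2):
    """Return the most popular items in the user's demographic class."""
    return [item for item, _ in counts[demographic_class(user)].most_common(top_n)]


# Hypothetical users and purchase histories.
users = [
    {"id": 1, "age": 16},
    {"id": 2, "age": 17},
    {"id": 3, "age": 21, "student": True},
]
purchases = {1: ["game", "comic"], 2: ["game"], 3: ["textbook"]}

counts = popularity_by_class(users, purchases)
new_user = {"id": 4, "age": 15}
suggestions = recommend_for(new_user, counts)
print(suggestions)
```

The new user has no history at all, yet still receives suggestions, because the recommendation depends only on class membership.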
All of the above-mentioned systems have complementary strengths and weaknesses. A hybrid recommendation system combines two or more recommendation techniques to achieve better performance with fewer of the weaknesses of any individual technique. The most popular hybrids combine content-based and collaborative filtering.
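One simple way to combine a content-based and a collaborative recommender is a weighted sum of their scores. The sketch below assumes both recommenders produce scores on a common 0-1 scale; the function name, the weight parameter, and the sample scores are illustrative assumptions, and other combination strategies exist.

```python
def weighted_hybrid(content_scores, collaborative_scores, content_weight=0.5):
    """Blend two score dicts; an item missing from one source scores 0 there."""
    items = set(content_scores) | set(collaborative_scores)
    combined = {}
    for item in items:
        content = content_scores.get(item, 0.0)
        collaborative = collaborative_scores.get(item, 0.0)
        combined[item] = (
            content_weight * content + (1 - content_weight) * collaborative
        )
    return sorted(combined.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical scores on a 0-1 scale from the two recommenders.
content_scores = {"x": 0.9, "y": 0.2, "z": 0.5}
collaborative_scores = {"x": 0.3, "y": 0.8}

balanced = weighted_hybrid(content_scores, collaborative_scores)
collab_heavy = weighted_hybrid(content_scores, collaborative_scores, content_weight=0.2)
print([item for item, _ in balanced])
print([item for item, _ in collab_heavy])
```

Item "z" has no collaborative score (for example, because it is new and unrated) but can still be ranked through its content score, which is one way a hybrid softens the weakness of a single technique.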
Methods/Strategies of Hybridization
There are different strategies by which hybridization can be achieved and they are broadly classified into seven categories:
The five most challenging issues recommendation systems face are:
More RS Issues
The future of recommendation systems is unclear. Options discussed include:
Recommendation systems are also being targeted to the following industries:
Wired.com recently published an article on Caterina Fake and her work with Hunch.com, especially with respect to the cold start problem.[1]