Automated Case Generation for Recommender Systems Using Knowledge Discovery Techniques

Patrick Clerkin, Conor Hayes, Pádraig Cunningham
Department of Computer Science
Trinity College Dublin
Patrick.Clerkin@cs.tcd.ie
Conor.Hayes@cs.tcd.ie
Padraig.Cunningham@cs.tcd.ie

Abstract

One approach to product recommendation in e-commerce is collaborative filtering, which is based on data of users’ consumption of assets. The alternative case-based approach is based on a more semantically rich representation of users and assets. However, generating these case representations can be a significant overhead in system development. In this paper we present an approach to case authoring based on data mining methods. Specifically, we focus on clustering algorithms. Having demonstrated the feasibility of this approach, we go on to consider what benefits such techniques might confer on the recommendation system. In this context we distinguish three levels of interpretability of cluster formations or concepts, and go on to argue that, while the first two levels offer no immediate advantages over each other in the recommendation domain, moving to the third level allows us to overcome the bootstrap problem of recommending assets to new users.

Introduction

A key role for intelligent systems in e-commerce is product recommendation (Cunningham et al., 2001). Conventionally, there are two alternatives to product recommendation; there is the content-based approach (case-based) and the collaborative recommendation approach. From one perspective, these approaches are opposites since the first is representation-based and the second is representation-less. Alternatively, the difference is one of degree, where the collaborative recommendation approach is based on a raw representation of users and assets and the content-based approach is based on a more semantically rich representation (Hayes, Cunningham & Smyth, 2001).
The objective in this paper is to explore the mechanisms for taking the raw structures on which collaborative recommendation is based and automatically eliciting the more semantically rich cases that can be used for content-based recommendation. This is worthwhile in order to overcome the shortcomings inherent in both approaches. The content-based approach to recommendation has the disadvantage that the representations on which the approach is based need to be determined at design time. On the other hand, the collaborative approach has bootstrap problems: there is no basis for recommending new items, and there is no basis on which to make recommendations to new users.

In this paper, we propose a means of overcoming these limitations whereby the data that underpins the collaborative recommendation is mined to discover appropriate representations to underpin content-based recommendation. We show how clustering can be used to generate case representations that can produce good quality recommendations. However, these representations lack interpretability, so they cannot overcome the bootstrap problem, because a new user cannot use this representation to set up their profile. The paper concludes with some discussion on how this case authoring might be improved to produce cases with interpretable descriptions.

Recommendation

As stated in the introduction, there are two approaches to recommendation on the Web. The recommendation process can be content-based, as represented by the upper path in Figure 1, where an appropriate representation of the assets and users’ requirements is determined at design time and recommendation is based on this representation. In the Case-Based Reasoning community this is referred to as case-based recommendation. The alternative lower path in the figure is automatic collaborative recommendation (ACF), which works with raw data on users’ ratings and behaviour and uses this data to produce recommendations.
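
To make the idea of recommending from raw consumption data concrete, the following is a minimal sketch of a neighbour-based collaborative recommender. It is an illustration only, not the system described in this paper: the data in `RATINGS`, the function names `jaccard` and `recommend`, and the choice of Jaccard overlap as the similarity measure are all assumptions introduced here.

```python
from collections import Counter

# Hypothetical raw consumption data (an assumption for illustration):
# each user is represented only by the set of assets they have consumed.
RATINGS = {
    "alice": {"a1", "a2", "a3"},
    "bob": {"a2", "a3", "a4"},
    "carol": {"a1", "a5"},
}


def jaccard(x, y):
    """Overlap between two sets of consumed assets (0.0 if both are empty)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def recommend(user_assets, ratings, k=2):
    """Rank unseen assets by the summed similarity of the users who consumed them."""
    scores = Counter()
    for other_assets in ratings.values():
        sim = jaccard(user_assets, other_assets)
        if sim == 0.0:
            continue  # users with no overlap contribute nothing
        for asset in other_assets - user_assets:
            scores[asset] += sim
    return [asset for asset, _ in scores.most_common(k)]


if __name__ == "__main__":
    print(recommend(RATINGS["alice"], RATINGS))  # existing user
    print(recommend(set(), RATINGS))             # new user with no history
```

In this sketch a user with an empty consumption set has zero overlap with everyone, so no asset can be scored. This is the bootstrap problem for new users noted in the introduction; an asset that nobody has yet consumed is equally invisible.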
The focus of this paper is on how knowledge discovery techniques can be applied to this raw data to establish the appropriate representations for content-based recommendation (see Figure 1). First, we will present brief descriptions of content-based and collaborative recommendation.

Content Based Recommendation

ACF, a representation-less recommendation process, is introduced in the next section. Before that, we describe a CBR-like content-based recommendation system that we can use for comparison purposes.