Towards an automatic characterization of criteria

Benjamin Duthil 1, François Trousset 1, Mathieu Roche 2, Gérard Dray 1, Michel Plantié 1, Jacky Montmain 1, and Pascal Poncelet 2

1 EMA-LGI2P, Parc Scientifique Georges Besse, 30035 Nîmes Cedex, France
name.surname@mines-ales.fr
2 LIRMM CNRS 5506, 161 Rue Ada, 34392 Montpellier, France
name.surname@lirmm.fr

Abstract. The number of documents is growing exponentially with the rapid expansion of the Web. The new challenge for Internet users is to rapidly find data appropriate to their requests. Information retrieval, automatic classification and opinion detection thus appear as major issues in our information society. Many efficient tools have already been proposed to Internet users to ease their search over the Web and support them in their choices. Users would now like genuine decision tools that efficiently help them focus on relevant information according to specific criteria in their area of interest. In this paper, we propose a new approach for the automatic characterization of such criteria. We show that this approach is able to automatically build a relevant lexicon for each criterion. We then show how this lexicon can be useful for document classification or segmentation tasks. Experiments have been carried out on real datasets and show the efficiency of our proposal.

Keywords: Criteria characterization, Mutual Information, Classification, Segmentation

1 Introduction

With the development of web technologies, ever-increasing amounts of documents are available. Efficient tools are designed to help extract relevant information. Information and Communication Technologies are thus a key factor in developing our modes of organisation, if not our societies. Everybody has already visited recommendation sites to consult other people's opinions before choosing a movie or an e-business website.
Automatically classifying and indexing documents are computerized tasks that contribute to the development of our information society. For example, many tools are already available to extract opinions on movies (e.g. http://www.premiere.fr/Cinema/Critique-Film) and to help cinemagoers select their movie theatre. Nevertheless, when considering Figure 1, which describes a cinemagoer's opinion of the movie Avatar, we may notice that the cinemagoer's opinion regarding the scenario criterion is rather negative (although the overall assessment is 9.5/10).