Learning Multicriteria Utility Functions with Random Utility Models

Géraldine Bous¹,² and Marc Pirlot²

¹ BIT Advanced Development, SAP, Sophia Antipolis, France (geraldine.bous@sap.com)
² Dept. of Mathematics & Operations Research, Faculté Polytechnique, Université de Mons, Belgium

Abstract. In traditional multicriteria decision analysis, decision maker evaluations or comparisons are considered to be error-free. In particular, algorithms such as UTA*, ACUTA or UTA-GMS for learning utility functions to rank a set of alternatives assume that the decision maker(s) are able to provide fully reliable training data in the form of, e.g., pairwise preferences. In this paper we relax this assumption by attaching a likelihood degree to each ordered pair in the training set; this likelihood degree can be interpreted as a choice probability (group decision making perspective) or, alternatively, as a degree of confidence about pairwise preferences (single decision maker perspective). Since binary choice probabilities reflect order relations, the former can be used to train algorithms for learning utility functions. We specifically address the learning of piecewise linear additive utility functions through a logistic distribution; we conclude with examples and use-cases that illustrate the validity and relevance of our proposal.

1 Introduction

Preference learning consists in determining a model that reflects the subjective value, i.e. the value as perceived by a decision maker (DM), of alternatives or items belonging to a set S (the reader may refer to Fürnkranz & Hüllermeier, 2011, for a general introduction to the topic). In artificial intelligence and decision theory, this problem is frequently solved by learning a value or utility function u such that the order obtained by ranking the alternatives by decreasing utility corresponds to the order induced on S by the preferences of the DM.
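To make the setting concrete, the following is a minimal sketch of the idea, not the paper's algorithm: it fits a plain linear additive utility u(x) = w·x (rather than the piecewise linear form addressed in the paper) from pairwise preferences with attached likelihood degrees, using a logistic choice model P(a ≻ b) = σ(u(a) − u(b)) and gradient ascent on the log-likelihood. All function names and the toy data are ours.

```python
# Hedged illustration (assumptions ours): learn a linear additive utility
# from pairwise preferences, each carrying a likelihood degree in (0, 1).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_utility(pairs, probs, n_criteria, lr=0.5, steps=3000):
    """pairs: list of (a, b) performance vectors, a stated preferred to b;
    probs: likelihood degree attached to each ordered pair (soft label)."""
    A = np.array([a for a, _ in pairs], dtype=float)
    B = np.array([b for _, b in pairs], dtype=float)
    p = np.array(probs, dtype=float)
    D = A - B                                  # criterion-wise differences
    w = np.zeros(n_criteria)
    for _ in range(steps):
        pred = sigmoid(D @ w)                  # model's choice probabilities
        w += lr * D.T @ (p - pred) / len(p)    # gradient of log-likelihood
    return w

# Toy learning set: two criteria, three noisy pairwise judgements.
pairs = [((0.9, 0.2), (0.1, 0.3)),
         ((0.8, 0.7), (0.4, 0.6)),
         ((0.3, 0.9), (0.5, 0.1))]
probs = [0.9, 0.8, 0.7]                        # confidence in each preference
w = fit_utility(pairs, probs, n_criteria=2)
# Ranking any alternatives by decreasing w . x now estimates the DM's order.
```

The learned weight vector can then score alternatives outside the learning set, which is exactly the generalization step discussed below.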
Typically, the preference relation on S is not entirely known; therefore, it is common practice to obtain a sample of the preference relation on a subset S_L ⊆ S, the learning set, on which to train the utility model. The utility function thereby obtained can then be used to evaluate alternatives in S \ S_L and to estimate the preference relation on S as a whole, thus making it possible to solve ranking or choice problems on S. In multicriteria decision theory, alternatives are characterized by their performances on several criteria; in this context, the preference learning problem aims at producing a utility function that evaluates items as a function of their