Fun at a Department Store: Data Mining Meets Switching Theory

Anna Bernasconi (1), Valentina Ciriani (2), Fabrizio Luccio (1), Linda Pagli (1)

(1) Dipartimento di Informatica, Università di Pisa, 56127 Pisa, Italy
{annab, luccio, pagli}@di.unipi.it

(2) Dipartimento di Tecnologie dell’Informazione, Università degli Studi di Milano, 26013 Crema, Italy
valentina.ciriani@unimi.it

Abstract

In this paper we introduce new algebraic forms, SOP+ and DSOP+, to represent functions f : {0,1}^n → N, based on arithmetic sums of products. These expressions are a direct generalization of the classical SOP and DSOP forms. We propose optimal and heuristic algorithms for minimal SOP+ and DSOP+ synthesis. We then show how the DSOP+ form can be exploited for Data Mining applications. In particular, we propose a new compact representation for the database of transactions to be used by the LCM algorithms for mining frequent closed itemsets.

Keywords: SOP, Implicants, Data Mining, Frequent Itemsets, Blulife.

Consider a department store with a very good body care department. Among many products, the following are in demand:

a - Algesiv: adhesive for dental plates,
b - Blulife: spray for breath with a floral scent,
c - Crinagen: lotion against hair loss,
d - Deocontrol: drops for feet odor control,
e - Earbeauty: spoon for taking out earwax,
f - Fleastop: powder against flea invasion,
g - Gluttonase: tablets for stomachache,
h - Haltmuc: tampon for nasal mucus,
i - Itchand: plastic hand for back scratching,
j - Johnheaven: toilet deodorant.¹

Now, take a look at the customers’ baskets (called transactions in the following). No wonder that most customers buying b also buy g and occasionally j; or that the ones buying f quite often buy i as well. It is also understandable that several people tend to buy b, d, and e together, and it is not uncommon to find a and c in the same basket, where, on the other hand, e seldom appears. But it seems to be a mystery why no one buying h also buys c.
¹ Some of these products are actually on the market; some are due to the authors’ fantasy. Just guess.

Studying associations among the items occurring jointly in a set of transactions is of paramount importance for conducting business of many kinds. Data mining is the main area in which such studies have been developed, and in which knowledge of the phenomena involved and of the related computational methods has reached maturity through a wealth of significant publications; see, e.g., [1, 9, 6], or the comprehensive bibliography of [3]. A point