Association of genetic profiles to Crohn’s disease by linear combinations of single nucleotide polymorphisms

Annarita D’Addabbo a, Anna Latiano b, Orazio Palmieri b, Maria Teresa Creanza a, Rosalia Maglietta a, Vito Annese b, Nicola Ancona a,*

a Istituto di Studi sui Sistemi Intelligenti per l’Automazione-C.N.R., Via Amendola 122/D-I, 70126 Bari, Italy
b Unità Operative di Gastroenterologia ed Endoscopia Digestiva, Ospedale IRCCS, “Casa Sollievo della Sofferenza”, Viale Cappuccini, 71013 San Giovanni Rotondo (FG), Italy

Received 22 January 2008; received in revised form 21 July 2008; accepted 21 July 2008

Artificial Intelligence in Medicine (2009) 46, 131–138
http://www.intl.elsevierhealth.com/journals/aiim

KEYWORDS: Single nucleotide polymorphisms; Crohn’s disease; Regularized least square classifiers

Summary

Motivations: A large number of single nucleotide polymorphisms (SNPs) are thought to be involved in the onset, differentiation and development of complex diseases. Univariate analysis is of limited use in studying complex traits, since it takes into account neither gene-gene interactions nor the correlation of multiple SNPs with a specific phenotype. Moreover, it may underestimate gene variants that make a weaker genetic contribution. More sophisticated techniques should therefore be adopted when investigating the role of a panel of genetic markers in disease predisposition.

Methods: We describe a general method for investigating the association between SNP profiles and Crohn’s disease (CD) that simultaneously evaluates the susceptibility or protective role of single markers or groups of markers. As the association measure we adopted a weighted linear combination of SNPs, in which the weighting vectors either belonged to predefined, over-complete vocabularies of vectors (frames) or were determined from the data.
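The data-driven variant of the association measure described under Methods can be illustrated with a minimal sketch. It assumes a standard ridge-type regularized least squares (RLS) fit, genotypes coded as 0/1/2 minor-allele counts, and labels +1 (case) and -1 (control). The coding, the regularization constant, the function names `fit_rls` and `snp_score`, and the toy data are all assumptions made for illustration and are not taken from the paper. Under this coding, a positive weight would point towards susceptibility and a negative weight towards a protective role.

```python
import numpy as np

def fit_rls(X, y, lam=1.0):
    """Ridge-type RLS weights: solve (X^T X + lam * n * I) w = X^T y."""
    n, d = X.shape
    A = X.T @ X + lam * n * np.eye(d)
    return np.linalg.solve(A, X.T @ y)

def snp_score(X, w):
    """Weighted linear combination of SNP codes, one score per subject."""
    return X @ w

# Synthetic toy data (NOT from the paper): 200 subjects, 5 SNPs coded 0/1/2.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 5)).astype(float)
X -= X.mean(axis=0)  # centre each SNP so that no intercept term is needed

# Hypothetical effects: SNP 0 raises risk, SNP 1 lowers it, SNPs 2 and 4 are irrelevant.
true_w = np.array([1.0, -1.0, 0.0, 0.5, 0.0])
y = np.sign(X @ true_w + 0.5 * rng.standard_normal(200))
y[y == 0] = 1.0  # guard against an exact zero from np.sign

w = fit_rls(X, y, lam=0.01)
scores = snp_score(X, w)
accuracy = np.mean(np.sign(scores) == y)  # training accuracy on the toy data
```

The sign and magnitude of each entry of `w` are what would be read as the role of the corresponding marker. Whether the combination is statistically associated with the phenotype would have to be assessed separately, for example with a permutation test, which this sketch does not attempt.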
Results: The proposed method found a weighted linear combination of SNPs that is statistically associated with CD ($p = 3.81 \times 10^{-10}$) and describes the role of the markers in the pathology. In particular, MCP1-A2518G made the largest contribution as a protective locus; the TNF-α C857T, DLG5 rs124869 and PTPN22 C1858T variants were also protective. The NFkB 94ATTG variant was found to be irrelevant for CD. The remaining markers were assigned a susceptibility role, which also confirms that markers on the CARD15 gene, in particular G908R and L1007fsinsC, are involved in CD to the same extent as FcGIIIA

* Corresponding author. Tel.: +39 080 5929428; fax: +39 080 5929460. E-mail address: ancona@ba.issia.cnr.it (N. Ancona).
0933-3657/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.artmed.2008.07.012