Interleukin-1 gene complex single nucleotide polymorphisms in systemic sclerosis: A further step ahead

Lorenzo Beretta a,*, Francesca Cappiello a, Jason H. Moore b, Raffaella Scorza a

a Referral Center for Systemic Autoimmune Diseases, IRCCS Fondazione Policlinico-Mangiagalli-Regina Elena and University of Milan, Milan, Italy
b Computational Genetics Laboratory, Departments of Genetics and Community and Family Medicine, Dartmouth Medical School, Lebanon, NH, USA

Received 1 October 2007; received in revised form 5 December 2007; accepted 19 December 2007

Summary
Gene–gene and gene–environment interactions are difficult to detect by traditional parametric computational approaches. Novel nonparametric and model-free strategies, such as the multifactor dimensionality reduction (MDR) algorithm, are thus emerging as practical and feasible methods for modeling high-order epistatic interactions, integrating and complementing traditional logistic approaches. With traditional methods of analysis we showed that the interleukin-1 (IL-1) C+3962T single nucleotide polymorphism (SNP), along with the Scl70 antibody and the diffuse cutaneous subset of systemic sclerosis, are important risk factors for the development of severe ventilatory restriction in patients with systemic sclerosis (SSc); however, the interactions among these and other genetic and environmental attributes were difficult to model. In contrast, MDR analysis detected significant two- or three-way interactions in the presence of nonlinearity. The best model identified by the MDR algorithm included the antibody subset, the IL-1 C-511T and the interferon-AUTR5644T SNPs, with a testing accuracy of 85% (p 0.001) and a cross-validation consistency of 10/10. This model outperformed any one- to three-way model constructed from the three factors with main independent effects identified by traditional computational approaches.
Epistatic interactions among IL-1 gene complex SNPs and clinical or environmental factors are more important than the single attributes in the development of severe ventilatory restriction in SSc patients.
© 2008 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.

KEYWORDS
Systemic sclerosis; Single nucleotide polymorphism; Lung fibrosis; Epistasis

* Corresponding author. Fax: +39 02 55035289. E-mail address: lorberimm@hotmail.com (L. Beretta).
Human Immunology (2008) 69, 187–192. 0198-8859/$ - see front matter. doi:10.1016/j.humimm.2007.12.006

Introduction
Complex relationships among multiple genetic variables, or among genetic and environmental variables, are sometimes difficult to model by traditional parametric statistical approaches, such as logistic regression, because sparse data may inflate both type I and type II errors [1]. This problem stems from the high number of empty contingency cells generated by the interaction of multiple variables, and was termed "the curse of dimensionality" by Bellman [2]. Even in datasets with a sufficiently large sample size, the multiple interactions among variables typical of complex human diseases may not be detected with traditional methods, given their