Integrating ‘omics’ data sets and biological knowledge: Multiple Factor Analysis as a powerful strategy

Marie de Tayrac 1, Sébastien Lê 2, Marc Aubry 1, François Husson 2, and Jean Mosser 1

1 CNRS UMR 6061 Génétique et Développement, Université de Rennes 1, Groupe Oncogénomique, 2 avenue du Pr. Léon Bernard - CS 34317, 35043 RENNES Cedex, France (e-mail: marie.de-tayrac@univ-rennes1.fr)
2 CNRS UMR 6625 Mathématiques appliquées, 5 rue de Saint-Brieuc - CS 84215, 35042 RENNES Cedex, France (e-mail: Sebastien.Le@agrocampus-rennes.fr)

Abstract. The huge amount of data provided by genome-scale technologies makes biological meaning difficult to discern. Here, we report a powerful integrative method to combine genome-wide data sets and biological knowledge. Multiple Factor Analysis (MFA) is used to jointly investigate large observation data sets from different ‘omic’ areas, enriched with biological annotations. This multifactorial method is suitable for a wide range of biological investigations and offers a comprehensive view of the structures of the data sets and the associated knowledge.

Keywords: Multiple Factor Analysis, Genomic, Transcriptomic, Functional Annotation, Integrative Analysis, Gliomas.

1 Introduction

High-throughput technologies provide an unprecedented amount of data, leading to new interpretation challenges in biology. Indeed, scientists lack strategies to identify the genes and gene products involved in the biological processes of interest. This is particularly true in cancer studies, where the causative roles of genes are underpinned by a high level of complexity. During the last few years, many efforts have been made to tackle this problem [Joyce and Palsson, 2006]. Nevertheless, it remains difficult to obtain from such data a concise visualization of the biological mechanisms involved in the situation under study.
Moreover, providing easy access to worldwide scientific knowledge is another challenge. Here, we are interested in a global understanding of cancer studies by jointly using two levels of investigation: the genome structure and its expression (the transcriptome). Chromosomal locus copy number alterations are detected using array-based Comparative Genomic Hybridization (array-CGH).