the design of the microarray. Second, local web services provide a simple user interface for input of data into the LIMS database, utilising Perl CGI scripts interfacing with the MySQL database. The database design utilises InnoDB tables, and places strong emphasis on the use of foreign key constraints and bar-coded laboratory samples to provide consistent, unambiguous and efficient sample tracking throughout the microarray design, manufacture and analysis processes. By combining sample tracking procedures with BAC clone data in a single database, array layout and feature identification can be automated much more effectively. Finally, Perl programs provide first-level analysis of image-extracted spot data, prior to data normalisation and smoothing, and provide automation functions for the batch analysis and subsequent database population of processed data.

Our laboratory is using GenomiPhi amplification of BAC clones from the CHORI ‘Golden Path’ 32K BAC clone set in the manufacture of CGH-microarrays. The utilities described here have been implemented in our manufacturing and analysis processes, and will continue to be developed to provide further functionality with regard to normalisation and data smoothing algorithms. The “Array Pipeline” suite is highly portable, and would be suitable for use with any of the commonly used approaches to CGH-microarray manufacture and analysis.

P25: Computational Aspects of Large Scale Cytogenetic Data Mining

Michael Baudis
University of Florida, Division of Pediatric Hematology

Over the last decades, thousands of (molecular-) cytogenetic studies have led to the description of disease-related structural and numerical abnormalities in the karyotypes (or, more generally, DNA content) of tumor cells. In agreement with the multistep model of oncogenesis, repetitive chromosomal aberration patterns supposedly reflect the cooperation of different genes in most malignant diseases.
A systematic analysis of these patterns for oncogenomic pathway description requires the large-scale compilation of (molecular-) cytogenetic tumor data, and the development of tools for transforming those data into a format suitable for data mining purposes. The main open questions addressed through this approach would be:

● Are complex cytogenetic aberration patterns specific to certain malignancies, reflecting disease-specific mechanisms, or are they signs of general cooperative mechanisms during tumor development?
● Do non-random genomic aberration patterns relate to specific oncogenes and tumor suppressor genes relevant for the development of the corresponding tumor type?
● Are there subtypes of different tumor entities with similar genomic aberration patterns, pointing to coherent genomic changes on top of different tissue-specific “expressoms”?

Although array/matrix CGH technologies have greatly improved the ability to detect small genomic abnormalities, those experiments have so far added little to the understanding of complex oncogenomic relationships. Additionally, the detection of single clone abnormalities has added a new layer of complexity to the interpretation of genomic high-resolution analyses.

The analysis of smaller sets of CGH results by means of bioinformatics methods, e.g. from single disease entities, has successfully been used for the generation of branched aberration pathways and even clinical prediction models. In those instances, reducing the input data to the occurrence of an imbalance per chromosomal arm serves as a good approximation. For large datasets and the generation of more general oncogenomic model systems, the preconceived notion of a limited set of hot spot regions or the unspecific reduction of data complexity will have only limited success. Here, I will review the availability and quantitative devel-

M. Baudis / European Journal of Medical Genetics 48 (2005) 443–476 473