Registering High Resolution Microscopic Images with Different Histochemical Stainings - A Tool for Mapping Gene Expression with Cellular Structures

Lee Cooper 1, Shan Naidu 2, Gustavo Leone 2, Joel Saltz 1, Kun Huang 1

1 Department of Biomedical Informatics, Ohio State University, Columbus, USA
2 The Comprehensive Cancer Center, Ohio State University, Columbus, USA

Abstract— One of the key problems in systems biology is to develop realistic models of the interactions between different types of cells with high resolution spatial information. Large scale microscopic imaging is an essential tool for this purpose since it can capture both cellular distribution and gene expression information. However, a difficult issue is how to integrate the information for cellular and molecular distributions, which are usually obtained using different types of staining techniques on different histological sections. This problem is particularly challenging due to the large size of the microscopic images (usually several gigabytes per image). In order to solve this problem, we present an image registration workflow for aligning microscopic images with different types of staining. This workflow contains three stages: rigid registration, nonrigid registration, and multiresolution refinement. The focus of this paper is to develop an efficient and scalable algorithm to obtain precise nonrigid registration results. We first establish that the sharpness of the normalized cross-correlation function can be used as a similarity measure for matching corresponding points between two images. This helps us avoid the high computational cost of computing other measures such as mutual information, which is critical for processing images this large. The matched points are used as control points to compute the nonrigid transformation between the two images.
In order to improve the matching accuracy, we adopt a multiresolution approach for accurately matching key regions of interest. We tested this algorithm on real histological images of mouse mammary gland samples, focusing on the mammary gland duct, which is the potential site for breast tumor initiation. The results show that our algorithm is highly accurate and will be applicable to large scale gene expression mapping studies of the breast tumor microenvironment.

Corresponding author: Lee Cooper, email: cooperle@gmail.com.

I. INTRODUCTION

One of the key problems in the post genomic era is to understand the regulation of gene expression in organisms. Proteomics techniques such as genechips (microarray) and mass spectroscopy have provided a tremendous amount of information on gene expression patterns; however, in most experiments these techniques are applied to biological samples that contain a diverse population of cells and therefore reflect "averaged" expression profiles. In contrast, gene expression profiles in different types of cells can be drastically different, and studies show that even the same type of cell in the same tissue microenvironment can exhibit heterogeneity in the expression levels of key proteins [15]. Therefore, the capability to map gene expression to individual cells is essential for exploring gene regulation within tissue environments at the cellular level.

In this paper we present a method for mapping gene expression based on registration of high-resolution microscopic images with distinct staining types. For instance, given two serial sections from a mouse mammary gland, one section is stained to identify cellular structure using specific immunohistochemical staining and the other section is stained for an important tumor suppressor gene, PTEN. By registering these two section images, we obtain a mapping of PTEN expression onto features of interest such as fibroblasts and epithelial cells.
To accomplish registration at the precision necessary for expression mapping, three challenges need to be addressed: comparison across multiple modalities, nonrigid deformation and change in morphology between the two images, and the large size of histological images obtained at high magnification. In this paper we address these challenges with the following approaches:
1) Design a new similarity measure for matching feature points between two images with different staining. The goal of image registration is to determine a transformation that maximizes the similarity between two images. Mutual information (MI) and normalized cross-correlation (NCC) are commonly used as similarity measures in registration. However, we found that for matching image