Abstract: Pathological changes within an organ may be reflected in proteomic patterns in serum. Mass spectrometry is becoming an important tool for generating these proteomic patterns. It yields complex functional data in which the features of scientific interest are the peaks. Because of this complexity, a higher-order analysis such as the wavelet transform is needed to uncover differences between proteomic patterns. We applied a wavelet-based feature extraction method to available data and used a filter approach to feature subset selection in order to identify appropriate biomarkers from the reconstructed mass spectra. Using different classification algorithms, our approach yielded an accuracy of 98%, a specificity of 97%, and a sensitivity of 100%.

Keywords: Proteomics, Cancer diagnosis, Wavelet transform, Classification, Biomarker.

I. INTRODUCTION

The development of tools for early cancer diagnosis has proven to be a major problem, and clinicians have investigated a variety of diagnostic techniques. Recently, it has been found that pathological changes within an organ might be reflected in proteomic patterns in serum [1]. The word 'proteome', coined in 1994, designates the complete set of proteins that ultimately results from genome transcription in a given cell, tissue, or organism [2]. Hence, proteomics is the science of making qualitative and quantitative comparisons among proteomes under various conditions (normal vs. cancer, treated vs. untreated) in order to understand biological processes such as disease. The field of proteomics has since evolved to include almost any type of technology that focuses on the wide-scale analysis of proteins [3][4].

Mass spectrometry is a tool that provides information about proteins and their fragments. The definition of a mass spectrometer may seem simple: it is an instrument that can ionize a sample and measure the mass-to-charge ratio of the resulting ions.
Therefore, a mass spectrometer can give qualitative and quantitative information on the elemental, isotopic, and molecular composition of organic samples [5].

Mass spectrum analysis is a fast, inexpensive procedure based on a sample of the patient's blood, and it may potentially allow cancer screening without any complication at the time of diagnosis. Applying mass spectrometry to the diagnosis of ovarian cancer could have an important effect on public health, but new biomarkers are essential to achieve this goal. For women at high risk of ovarian cancer due to a family or personal history of cancer, there are no effective screening options. Ovarian cancer presents at a late clinical stage in more than 80% of patients, and 35% of these patients survive 5 years after diagnosis [6]. On the other hand, the 5-year survival for patients with stage I ovarian cancer exceeds 90%, so increasing the number of women diagnosed at stage I should have a direct effect on the mortality and morbidity of the disease.

From a modeling viewpoint, mass spectra can be considered complex functional data in which the key features of scientific interest are the peaks in the mass spectrum curve [7]. The peaks represent proteins or protein fragments (peptides) in the proteomic pattern. Raw mass spectrometry data tend to be incomplete, noisy, highly correlated within the spectrum profile, and high-dimensional, among other problems, and hence are not directly suitable for feature extraction. Additionally, mass spectral data display variations in the protein profile even at identical instrumental settings and sample conditions. To minimize the effect of irrelevant sources of variation, such as humidity and time, and to extract the information in mass spectra in more detail, a more sophisticated preprocessing method is needed, one that both de-noises and compresses the data.
The wavelet transform (WT) is an effective tool for dimension reduction and noise removal in the analysis of proteomic data. Wavelets are very popular in signal processing because they can analyze both the local and the global behavior of functions. The WT is a projection of the spectrum onto an orthogonal basis, called a wavelet basis [8]. That is, the spectrum can be represented by a set of localized orthogonal basis functions called wavelets. Wavelet analysis can thus provide a de-noised and compressed representation of mass spectrometry data that makes the feature extraction process more efficient and accurate, owing to several favorable properties: a hierarchical, multiresolution decomposition structure, de-correlated coefficients, and a wide variety of possible orthogonal basis functions.

We have applied a wavelet-based feature extraction method to the mass spectra of ovarian cancer patients and of healthy people. We have used a filter approach for feature subset selection, and we have employed the reconstructed mass spectra to identify appropriate biomarkers and to evaluate classification performance. Our results confirm that mass spectrometry proteomic profiles allow the diagnosis of ovarian cancer, and therefore that wavelet-based reconstructed mass spectra can be a viable basis for its diagnosis. On the data sets, our technique achieved an accuracy of 98%, a specificity of 97%, and a sensitivity of 100%.

Ovarian Cancer Diagnosis Using Discrete Wavelet Transform Based Feature Extraction from Serum Proteomic Patterns
H. Montazery Kordy 1, M. H. Miranbaygi 1, M. H. Moradi 2
1 Department of Electrical Engineering, Tarbiat Modarres University, Tehran, Iran
2 Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran
E-mail: hmontazery@modares.ac.ir
PROC. CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE 2006 ©
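
The wavelet-based de-noising and compression outlined in the Introduction can be illustrated with a minimal sketch. It is not the authors' implementation: the orthonormal Haar wavelet, the hard threshold, the two decomposition levels, the toy eight-point intensity vector, and all function names are assumptions chosen only to keep the example short and dependency-free.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal, levels):
    """Multi-level orthonormal Haar DWT.

    Returns (approx, details), where details[0] holds the finest-scale
    coefficients. The signal length must be divisible by 2**levels.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) % 2:
            raise ValueError("length must be divisible by 2**levels")
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Differences capture local detail, sums capture the coarse trend.
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def haar_idwt(approx, details):
    """Invert haar_dwt: rebuild the signal from coarse to fine scale."""
    approx = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(approx, detail):
            rebuilt.append((a + d) / SQRT2)
            rebuilt.append((a - d) / SQRT2)
        approx = rebuilt
    return approx


def hard_threshold(details, threshold):
    """Zero every detail coefficient whose magnitude is <= threshold."""
    return [[d if abs(d) > threshold else 0.0 for d in level]
            for level in details]


def denoise(signal, levels=2, threshold=0.5):
    """Decompose, discard small detail coefficients, and reconstruct.

    Returns the reconstructed signal plus the retained coefficients,
    which together form the compressed representation.
    """
    approx, details = haar_dwt(signal, levels)
    kept = hard_threshold(details, threshold)
    return haar_idwt(approx, kept), approx, kept


if __name__ == "__main__":
    # Hypothetical intensities: one peak (5.0, 5.2) on a noisy baseline.
    spectrum = [0.1, 0.0, 0.2, 0.1, 5.0, 5.2, 0.1, 0.0]
    smooth, approx, kept = denoise(spectrum)
    nonzero = sum(1 for level in kept for c in level if c != 0.0)
    print("reconstructed:", [round(x, 3) for x in smooth])
    print("coefficients kept:", len(approx) + nonzero, "of", len(spectrum))
```

Because the Haar basis is orthonormal, the inverse transform reproduces the input exactly when no coefficient is discarded. Thresholding trades a small reconstruction error for a sparser representation, and the retained coefficients are the natural candidates for features. A real spectrum would call for a smoother wavelet family and a data-driven threshold.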