Abstract—Pathological changes within an organ might be
reflected in proteomic patterns in serum. Mass spectrometry
is becoming an important tool for generating these proteomic
patterns. Mass spectrometry yields complex functional data
for which the features of scientific interest are the peaks. Due
to this complexity, a higher-order analysis such as the
wavelet transform is needed to uncover the differences in
proteomic patterns. We have applied a wavelet-based feature
extraction method to available data and used a filter
approach to feature subset selection in order to identify the
appropriate biomarkers from reconstructed mass spectra.
Using different classification algorithms, our approach
yielded an accuracy of 98%, specificity of 97%, and sensitivity
of 100%.
Keywords: Proteomics, Cancer diagnosis, Wavelet
transform, Classification, Biomarker.
I. INTRODUCTION
THE development of tools for early cancer
diagnosis has proven to be a major problem, and
clinicians have investigated a variety of diagnostic
techniques. Recently, it has been found that pathological
changes within an organ might be reflected in proteomic
patterns in serum [1]. The word ‘proteome’, coined in 1994,
designates the complete set of proteins that ultimately
results from genome transcription in a given cell, tissue, or
organism [2]. Hence, proteomics is the science of making
qualitative and quantitative comparisons among proteomes
under various conditions (normal vs. cancer, treated vs.
untreated) in order to understand biological processes
such as disease. The field of proteomics has since
evolved to include almost any type of technology that
focuses on the wide-scale analysis of proteins [3][4].
Mass spectrometry is a tool that provides
information about proteins and their fragments. The
definition of a mass spectrometer may seem simple: it is an
instrument that can ionize a sample and measure the
mass-to-charge ratio of the resulting ions. A mass
spectrometer can therefore give qualitative and quantitative
information on the elemental, isotopic, and molecular
composition of organic samples [5]. Mass spectrum
analysis is a fast, inexpensive procedure based on a sample
of the patient’s blood, and it may allow cancer
screening without complications at the time of diagnosis.
Application of mass spectrometry for diagnosis of
ovarian cancer could have an important effect on public
health, but to achieve this goal new biomarkers are
essential. For women at high risk of ovarian cancer due
to a family or personal history of cancer, there are no
effective screening options. Ovarian cancer presents itself
at a late clinical stage in more than 80% of patients, and
35% of this population survive 5 years after diagnosis [6].
On the other hand, the 5-year survival for patients with
stage I ovarian cancer exceeds 90%, so increasing the
number of women diagnosed at stage I should have a
direct effect on the mortality and morbidity of the disease.
From a modeling viewpoint, the mass spectra can be
considered complex functional data in which the key
features of scientific interest are the peaks in the mass
spectrum curve [7]. The peaks represent proteins or protein
fragments (peptides) in the proteomic pattern. Raw mass
spectrometry data tend to be incomplete, noisy, highly
correlated within the spectrum profile, and high-dimensional,
and hence are not directly suitable for feature extraction.
Additionally, mass spectral data display variations in the
protein profile even at identical instrumental settings and
sample conditions. In order to minimize the effect of
irrelevant sources of variation such as humidity and time,
and to be able to extract the information in the mass spectra in
more detail, a more sophisticated preprocessing method
that de-noises as well as compresses the data needs to be
utilized.
The wavelet transform (WT) is an effective tool for
dimension reduction and noise removal in the analysis of
proteomic data. Wavelets are very popular in signal
processing because they are able to analyze both local and
global behavior of functions. The WT is a projection of the
spectrum onto an orthogonal basis, called a wavelet basis
[8]. This is to say that the spectrum can be represented by a
set of localized orthogonal basis functions called wavelets.
Thus, wavelet analysis can provide a de-noised and
compressed representation of mass spectrometry data that
makes the feature extraction process more efficient and
accurate, owing to favorable properties such as a
hierarchical, multiresolution decomposition structure,
de-correlated coefficients, and a wide variety of possible
orthogonal basis functions.
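The de-noising and compression idea described above can be sketched with the simplest orthogonal wavelet, the Haar basis. This is only an illustration under stated assumptions: the Haar wavelet, the three decomposition levels, the hard threshold of 0.5, the function names, and the toy spectrum are all hypothetical choices made for the example and are not taken from the method reported in this paper.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).
    Assumes len(signal) is even."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def wavedec(signal, levels):
    """Multilevel decomposition, returned as [cA_n, cD_n, ..., cD_1].
    Assumes len(signal) is divisible by 2**levels."""
    coeffs = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        coeffs.insert(0, detail)
    coeffs.insert(0, approx)
    return coeffs

def waverec(coeffs):
    """Reconstruct a signal from [cA_n, cD_n, ..., cD_1]."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_idwt(approx, detail)
    return approx

def hard_threshold(coeffs, threshold):
    """Zero every detail coefficient whose magnitude is below the threshold.
    The coarsest approximation band is kept unchanged."""
    kept = [coeffs[0]]
    for detail in coeffs[1:]:
        kept.append([d if abs(d) >= threshold else 0.0 for d in detail])
    return kept

# Hypothetical toy "spectrum": one broad peak on a small oscillating baseline.
spectrum = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 5.0, 5.2,
            5.1, 4.9, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]

coeffs = wavedec(spectrum, levels=3)
denoised = waverec(hard_threshold(coeffs, threshold=0.5))
```

Because the basis is orthonormal, the coefficients carry the same energy as the original samples, and the small detail coefficients produced by the baseline oscillation can be discarded while the peak is retained. The surviving non-zero coefficients form the compressed representation from which the spectrum is reconstructed.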
We have applied a wavelet-based feature extraction
method to the mass spectra of ovarian cancer patients and
those of healthy people. We have used a filter approach for
feature subset selection. We have employed the
reconstructed mass spectra to identify the appropriate biomarkers
and to evaluate the classification performance. Our results
confirm that mass spectrometry proteomic
profiles allow the diagnosis of ovarian cancer. Therefore,
wavelet-based reconstructed mass spectra can be a
viable basis for the diagnosis of ovarian cancer. On the data
sets used, the developed technique achieved an accuracy of
98%, a specificity of 97%, and a sensitivity of 100%.
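As a rough illustration of what a filter approach and the reported performance measures involve, the sketch below scores each feature independently of any classifier and then computes accuracy, sensitivity, and specificity from predicted labels. The scoring criterion (an absolute Welch t-statistic), the function names, and the toy data are assumptions made for this example; the paper does not state at this point which filter criterion was used.

```python
import math

def t_score(cancer_vals, healthy_vals):
    """Absolute Welch t-statistic of one feature between the two classes.
    Assumed criterion for illustration only; needs >= 2 samples per class."""
    def mean(v):
        return sum(v) / len(v)

    def var(v):
        m = mean(v)
        return sum((x - m) ** 2 for x in v) / (len(v) - 1)

    diff = abs(mean(cancer_vals) - mean(healthy_vals))
    se = math.sqrt(var(cancer_vals) / len(cancer_vals)
                   + var(healthy_vals) / len(healthy_vals))
    if se == 0.0:
        return float("inf") if diff > 0.0 else 0.0
    return diff / se

def rank_features(X, y, k):
    """Filter selection: indices of the k highest-scoring features.
    X is a list of feature rows; y holds 1 (cancer) or 0 (healthy)."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        cancer = [row[j] for row, label in zip(X, y) if label == 1]
        healthy = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(t_score(cancer, healthy))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:k]

def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate).
    Assumes both classes occur in y_true."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical toy data: 4 cancer and 4 healthy samples, 3 features each.
# Only the middle feature differs between the two classes.
X = [
    [1.0, 5.1, 0.3], [1.2, 4.9, 0.1], [0.9, 5.3, 0.2], [1.1, 5.0, 0.4],
    [1.0, 2.0, 0.2], [1.1, 2.2, 0.3], [0.9, 1.9, 0.1], [1.2, 2.1, 0.4],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]
selected = rank_features(X, y, k=1)

# Hypothetical predictions from some classifier, with one false positive.
metrics = diagnostic_metrics([1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 1])
```

A filter of this kind ranks features before any classifier is trained, which keeps the selection step cheap on high-dimensional spectra. In a screening setting, sensitivity measures how many cancer cases are detected and specificity how many healthy subjects are correctly cleared.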
Ovarian Cancer Diagnosis Using Discrete Wavelet Transform Based Feature
Extraction from Serum Proteomic Patterns
H. Montazery Kordy¹, M. H. Miranbaygi¹, M. H. Moradi²
¹ Department of Electrical Engineering, Tarbiat Modarres University, Tehran, Iran
² Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran
E-mail: hmontazery@modares.ac.ir
PROC. CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE 2006© 1