IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, VOL. 19, NO. 2, MARCH 2015 709

A Multiscale Approach for Modeling Atherosclerosis Progression

Konstantinos P. Exarchos, Clara Carpegianni, Georgios Rigas, Themis P. Exarchos, Member, IEEE, Federico Vozzi, Antonis Sakellarios, Paolo Marraccini, Katerina Naka, Lambros Michalis, Oberdan Parodi, and Dimitrios I. Fotiadis, Senior Member, IEEE

Abstract—Progression of the atherosclerotic process constitutes a serious and quite common condition due to the accumulation of fatty materials in the arterial wall, consequently posing serious cardiovascular complications. In this paper, we assemble and analyze a multitude of heterogeneous data in order to model the progression of atherosclerosis (ATS) in coronary vessels. The patient’s medical record, biochemical analytes, monocyte information, adhesion molecules, and therapy-related data comprise the input for the subsequent analysis. As an indicator of coronary lesion progression, two consecutive coronary computed tomography angiographies have been evaluated in the same patient. To this end, a set of 39 patients is studied using a twofold approach, namely, baseline analysis and temporal analysis. The former approach employs baseline information in order to predict the future state of the patient (in terms of progression of ATS). The latter is based on dynamic Bayesian networks, whereby snapshots of the patient’s status over the follow-up are analyzed in order to model the evolution of ATS, taking into account the temporal dimension of the disease. The quantitative assessment of our work has resulted in 93.3% accuracy for the baseline analysis and 83% overall accuracy for the temporal analysis, in terms of modeling and predicting the evolution of ATS.
It should be noted that the application of the SMOTE algorithm for handling class imbalance, together with the subsequent evaluation procedure, might have led to an overestimation of the performance metrics, owing to the use of synthesized instances. The most prominent features found to play a substantial role in the progression of the disease are diabetes, cholesterol, and cholesterol/HDL. Among novel markers, the CD11b marker of the leukocyte integrin complex is associated with coronary plaque progression.

Index Terms—Atherosclerosis (ATS) progression, classification, dynamic Bayesian networks.

Manuscript received September 3, 2013; revised April 28, 2014, February 26, 2014, and December 23, 2013; accepted November 17, 2013. Date of publication May 14, 2014; date of current version March 2, 2015. This work was supported in part by the European Commission (Project ARTREAT: Multi-level Patient-Specific Artery and Atherogenesis Model for Outcome Prediction, Decision Support Treatment, and Virtual Hand-on Training, FP7-224297).

K. P. Exarchos, G. Rigas, T. P. Exarchos, A. Sakellarios, and D. I. Fotiadis are with the Department of Materials Science and Engineering, Unit of Medical Technology and Intelligent Information Systems, University of Ioannina, GR 45110 Ioannina, Greece, and also with the Foundation for Research and Technology - Hellas, Institute of Molecular Biology and Biotechnology, Department of Biomedical Research, GR 45110, Ioannina, Greece (e-mail: kexarcho@gmail.com; rigas@cs.uoi.gr; exarchos@cc.uoi.gr; ansakel@cc.uoi.gr; fotiadis@cc.uoi.gr).

C. Carpegianni, F. Vozzi, P. Marraccini, and O. Parodi are with the Institute of Clinical Physiology, National Research Council, 56124 Pisa, Italy (e-mail: clara@ifc.cnr.it; vozzi@ifc.cnr.it; paolo.marraccini@ifc.cnr.it; oberdan.parodi@virgilio.it).

K. Naka and L.
Michalis are with the Department of Cardiology, Medical School, University of Ioannina, GR 45110, Ioannina, Greece (e-mail: anaka@cc.uoi.gr; lmihalis@cc.uoi.gr).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JBHI.2014.2323935

I. INTRODUCTION

Atherosclerosis (ATS) is a pathological condition affecting the arterial wall and is responsible for unfavorable clinical manifestations and mortality. ATS as a disease and its pathophysiology have been laid out in detail elsewhere [1]; therefore, we will only briefly go through some key points. The pathogenesis of ATS encompasses three main processes: 1) endothelial dysfunction; 2) lipid plaque formation; and 3) atheromatic plaque formation. Several risk factors have been identified that affect the progression of these processes, and hence ATS as a whole. Traditional risk factors are diabetes, family history, dyslipidemia, hypertension, smoking habits, age, and sex [2], [3].

In the literature, there are several studies attempting to correlate patient phenotype (e.g., age, smoking, ankle–brachial pressure index, etc.) with the development of coronary artery disease (CAD), commonly measured using the result of the coronary angiography procedure [4], [5]. The majority of these studies focus on computing correlations between single features and the angiography outcome. Moreover, patients with intermediate risk for CAD are assessed with tools such as the popular Framingham scoring system [6] or similar risk stratification tools [7], [8]. A similar score has recently been developed within the HEARTCYCLE project [9]. There is also a body of research more closely related to the objectives of our study, i.e., the use of data mining techniques for the development of decision support systems predicting the outcome of coronary angiography and the progression of ATS.
A recent data mining study based on a dataset of about 200 patient records [10] aims to classify the grade of coronary ATS not only on the basis of the commonly used features (age, sex, family history, blood tests, blood pressure, etc.), but also on the results of pulse wave velocity. Various methods are tested for building the classifiers, and one based on decision trees and fuzzy modeling seems to provide the most accurate results (73% accuracy). Another interesting data mining study is based on a dataset of 655 patients and 202 features. The dataset contains, for each patient, a detailed description of the stenosis of each of the four arteries. The study employs decision trees and association rules to develop a decision support system for predicting the stenosis of each individual artery [11], [12]. Decision support systems have also been primarily used for the diagnosis of CAD by utilizing several sources of information, such as ECG signals [13], single photon emission computed

2168-2194 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.