IJCSNS International Journal of Computer Science and Network Security, VOL.19 No.12, December 2019

Manuscript received December 5, 2019
Manuscript revised December 20, 2019

A Proposed Decision Tree Classifier for Atherosclerosis Prediction and Classification

Yousef K Qawqzeh 1, Mohammad Mahmood Otoom 1, Fayez Al-Fayez 1, Ibrahim Almarashdeh 2, Mutasem Alsmadi 2, Ghaith Jaradat 3

1 Computer Science Department, College of Science, Majmaah University, Saudi Arabia
2 MIS Dept., College of Applied Studies and Community Service, Imam Abdurrahman Bin Faisal University, Al-Dammam, Saudi Arabia
3 Department of Computer Science, Faculty of Computer Science and Information Technology, Jerash University, Jordan

Abstract
Cardiovascular diseases (CVD) represent a major threat to human life. Because most CVD symptoms develop silently inside the cardiovascular system, predicting the disease before it becomes life-threatening is a valuable achievement. Tracking the silent development of atherosclerosis inside the arteries may lead to new methods for early detection and prevention of CVD. Atherosclerosis is one of the main causes of CVD. The more silent atherosclerosis is, the more difficult it is to detect. It is a chronic disease that causes the arterial wall to stiffen. People normally do not visit a diagnostic center or consult their doctor unless the risk reaches a high level. This study utilized features extracted from the photoplethysmogram (PPG) for tracking and evaluating high-risk atherosclerosis. A sample of 196 participants was enrolled in this study. Their carotid intima-media thickness (CIMT) test results were recorded. The PPG indices, along with the Age index, are fed to a decision tree classifier developed in MATLAB to predict and classify new data into high-risk atherosclerosis or normal atherosclerosis. The developed classifier showed promising results, with an overall accuracy of 82.6%.
Additionally, it showed a sensitivity of 89.3% and a specificity of 69.2%. These results suggest that the method could serve as a valid surrogate measure for atherosclerosis alongside the CIMT test.

Key words: Atherosclerosis; Photoplethysmogram; Classification; Decision tree; Prediction.

1. Introduction

Cardiovascular diseases (CVD) represent a major threat to human life. Because most CVD symptoms develop silently inside the cardiovascular system, predicting the disease before it becomes life-threatening would be a valuable advance. Tracking the silent development of atherosclerosis inside the arteries may lead to new methods for early detection and prevention of CVD. Atherosclerosis is one of the main causes of CVD. The more silent atherosclerosis is, the more difficult it is to detect. It is a chronic disease that causes the arterial wall to stiffen. People normally do not visit a diagnostic center or consult their doctor unless the risk reaches a high level.

This study utilized features extracted from the photoplethysmogram (PPG) for tracking and evaluating high-risk atherosclerosis. PPG is an optical volumetric measurement of an organ. It can be defined as the blood volume changes inside the arteries (Qawqzeh et al., 2015). A sample of 196 participants was enrolled in this study, and their carotid intima-media thickness (CIMT) test results and PPG data were recorded. Table 1 below illustrates the descriptive analysis of participants.

Classification strategies are widely used in clinical settings for predicting a patient's health status. This work implements a decision tree method to predict and track atherosclerosis accumulation. Several comparative studies showed that decision tree classifiers are simple and accurate (Latha & Jeeva, 2019). The proposed prediction model is constructed using the decision tree method in the MATLAB environment.
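To make the classification idea concrete, the following is a minimal sketch of a one-level decision tree (a "stump") together with the accuracy, sensitivity, and specificity measures reported above. It is written in Python rather than MATLAB and is not the authors' implementation. The function names (`gini`, `best_threshold`, `evaluate`), the use of Gini impurity as the split criterion, the use of age as the single feature, and all data values are assumptions made purely for illustration; the toy values are synthetic and are not taken from the study.

```python
from typing import Dict, List, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of binary labels (0 = normal, 1 = high risk)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(values: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, weighted Gini) of the best split 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    best_thr, best_score = pairs[0][0], float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no split point between identical values
        thr = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [lab for val, lab in pairs if val <= thr]
        right = [lab for val, lab in pairs if val > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_thr, best_score = thr, score
    return best_thr, best_score


def evaluate(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity (assumes both classes are present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of high-risk cases detected
        "specificity": tn / (tn + fp),  # share of normal cases recognised
    }


# Synthetic toy data (NOT from the study): age in years and a 0/1 risk label.
ages = [30.0, 35.0, 40.0, 45.0, 50.0, 55.0, 60.0, 65.0]
risk = [0, 0, 0, 0, 1, 0, 1, 1]

threshold, impurity = best_threshold(ages, risk)
# Illustrative assumption: values above the threshold are labelled high risk.
predicted = [1 if age > threshold else 0 for age in ages]
print(threshold, evaluate(risk, predicted))
```

A full decision tree repeats this split search recursively on each resulting subset and over all available features (here these would be the PPG indices and age), whereas the sketch stops after a single split on one feature.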
1.1 Data collection methods

This section briefly describes the PPG data and CIMT test recordings. A customized PPG setup was used to record PPG data from each participant in a temperature-controlled (about 25 °C), equipped hospital room (General Hospital of Zulfi, Riyadh, KSA). Subjects were asked to rest quietly for 3 minutes to allow cardiovascular stabilization. A PPG probe was then applied to the right-hand index finger. The patient was asked to remain quiet and breathe normally. The PPG recording for each subject ran for 2 minutes. Pre-processing, such as down-sampling and de-trending, was applied to remove outliers and drifts. PPG signals were filtered using a band-pass filter (0.615 Hz) to remove any respiratory rhythm and higher-frequency disturbances. Finally, the extracted features were saved in an Excel sheet for possible further analysis. The following four indices, b/a, RI, SPt, and DiP, showed a highly significant relationship with the CIMT test. In addition, the 'Age' index was also highly significant.

The CIMT data were collected using a carotid duplex ultrasound scanner that comprises a screen for video display, a computer console, and a transducer (probe). The recordings were