Indonesian Journal of Electrical Engineering and Computer Science
Vol. 13, No. 1, January 2019, pp. 405~410
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v13.i1.pp405-410
Journal homepage: http://iaescore.com/journals/index.php/ijeecs

Classification enhancement of breast cancer histopathological image using penalized logistic regression

Mohammed Abdulrazaq Kahya
Department of Computer science, Education College for Pure Science, University of Mosul, Mosul, Iraq

Article Info

Article history:
Received Jun 19, 2018
Revised Aug 21, 2018
Accepted Nov 18, 2018

Keywords:
Breast cancer
Histopathological image
L1-norm
Penalized logistic regression
Smoothing

ABSTRACT

Classification of breast cancer histopathological images plays a significant role in computer-aided diagnosis systems. A features matrix is extracted in order to classify these images, and it may contain outlier values that adversely affect classification performance. Smoothing the features matrix has been proved to be an effective way to improve the classification result by eliminating outlier values. In this paper, an adaptive penalized logistic regression is proposed that combines penalized logistic regression with the smoothed features matrix, with the aim of smoothing the features and providing high classification accuracy for histopathological images. Experimental results on a recent, publicly available breast cancer histopathological image dataset show that the proposed method significantly outperforms penalized logistic regression in terms of classification accuracy and area under the curve. Thus, the proposed method can be useful in real clinical practice for histopathological image classification and for classifying other disease types using DNA gene expression data.

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Mohammed Abdulrazaq Kahya,
Department of Computer science, Education College for Pure Science, University of Mosul, Mosul, Iraq.
Email: mohammedkahya@uomosul.edu.iq

1. INTRODUCTION

Nowadays, cancer is the second leading cause of death worldwide. The World Health Organization (WHO) confirmed that cancer caused 8.2 million deaths in 2012 and 8.8 million in 2015. Moreover, it expected 27 million new cases of the disease before 2030 [1]. In particular, breast cancer is one of the leading causes of death among women worldwide. A recent study confirmed that breast cancer accounts for 18% of all cancers in women and is the fifth cause of death worldwide [2]. However, early-stage diagnosis and therapy can increase the survival rate to 98% [3].

There are many noninvasive imaging techniques for breast cancer, such as magnetic resonance imaging (MRI), mammograms (X-rays), ultrasonography, and histopathological imaging [4-7]. Diagnosis using histological images has become the gold standard for deadly diseases such as breast and lung cancer, and it gives a satisfactory diagnosis compared with other methods such as mammography and ultrasonography [8]. In addition, machine learning techniques have been used to enhance diagnostic accuracy for breast cancer through computer-assisted systems [9]. In general, breast cancer is classified into benign and malignant types, and this diagnosis is very important in drug discovery and treatment [10-11].

Logistic regression (LR) is one of the well-known machine learning techniques for classification, alongside support vector machines (SVM), random forests (RF), and neural networks (NNet) [12]. LR is a widely used classification technique with many application areas, such as gene expression data [13], prediction of therapy outcome [14], and protein function [15].