ORIGINAL ARTICLE

Stable gene selection by self-representation method in fuzzy sample classification

Armaghan Davoudi 1 & Hamid Mahmoodian 1,2

Received: 11 July 2019 / Accepted: 12 March 2020
© International Federation for Medical and Biological Engineering 2020

Abstract
In recent years, microarray technology and gene expression profiles have been widely used to detect, predict, or classify samples of various diseases. The large number of genes in these profiles and the small number of samples are known challenges in this field and have been widely addressed in previous papers. Other topics, such as the noise in microarray data and the dependence of the selected genes on the samples, have received less attention. We therefore address these two issues by using a fuzzy classifier and a stability index of the selected genes, respectively. The proposed method is based on a regression function between the genes and the class labels, which is determined by the self-representation method. This regression function is determined individually for each class of the database. To minimize the effect of noise in microarray data, a fuzzy classifier is applied in the proposed model. Four databases of gene expression profiles are examined in this article, and the results indicate that the proposed model has a relative advantage over previous methods.

Keywords Stable gene selection · Self-representation · Fuzzy classifier

1 Introduction

In recent years, the use of gene expression profiles for the diagnosis and prognosis of diseases has grown dramatically. The large dimensions of these matrices and the large number of genes they contain are two important factors in creating prediction models. Many studies have been done on this subject in recent years [1, 2]. Therefore, separating the effective genes is necessary not only to create effective models but also to identify biological behaviors.
An important issue in the separation of these genes, which is not usually considered in previous papers, is the degree of stability of the selected genes. Stability here means the degree to which the selected genes depend on the samples under study or on the feature selection method. In addition, because gene expression values are measured with microarray equipment and derived from the processing of color images, these values can be noisy. The accuracy of the models under noise-induced changes in the expression values is therefore an important issue that has received less attention in previous studies.

In general, gene selection methods are classified into two categories: filter and wrapper. In filter methods, the score of each gene is determined independently of the classifier, and a set of genes with high scores is then selected. Some commonly used methods in this category are presented in [3–7]. Most of the methods in this category are based on statistical parameters and depend heavily on the gene expression values. As a result, models built with filter-based methods are more sensitive to noise in these data. To reduce the number of redundant genes, which is a common problem in this category, these methods have been extended, and maximum relevance minimum redundancy methods have been proposed [8].

Gene selection methods that evaluate genes by their contribution to the separation of the samples under a particular classifier are categorized as wrapper methods. Support vector machine-recursive feature elimination (SVM-RFE) [7] is one of the well-known wrapper methods.
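To make the notions of filter scoring and selection stability concrete, the following is a minimal sketch under stated assumptions, not the method of this article. It scores genes with a generic two-class Welch t-statistic (one possible filter criterion, chosen here only for illustration) and estimates stability as the mean pairwise Jaccard overlap of the gene sets selected on random subsamples. The Jaccard measure, the function names, and the synthetic data are all assumptions introduced for this example; the stability index used in the article may differ.

```python
import numpy as np


def filter_scores(X, y):
    """Absolute Welch t-statistic per gene for a two-class problem (labels 0/1).

    X has shape (n_samples, n_genes). This is a generic filter score,
    assumed here for illustration only.
    """
    a, b = X[y == 0], X[y == 1]
    num = a.mean(axis=0) - b.mean(axis=0)
    den = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(num) / (den + 1e-12)  # small constant avoids division by zero


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring genes as a set."""
    order = np.argsort(filter_scores(X, y))[::-1]
    return set(order[:k].tolist())


def jaccard_stability(X, y, k, n_rounds=20, frac=0.8, seed=0):
    """Mean pairwise Jaccard overlap of gene sets selected on subsamples.

    Each round draws a class-stratified subsample without replacement.
    A value near 1 means the selection barely depends on the samples used.
    """
    rng = np.random.default_rng(seed)
    subsets = []
    for _ in range(n_rounds):
        parts = []
        for c in (0, 1):
            members = np.flatnonzero(y == c)
            size = max(2, int(frac * len(members)))
            parts.append(rng.choice(members, size=size, replace=False))
        idx = np.concatenate(parts)
        subsets.append(select_top_k(X[idx], y[idx], k))
    overlaps = [
        len(subsets[i] & subsets[j]) / len(subsets[i] | subsets[j])
        for i in range(n_rounds)
        for j in range(i + 1, n_rounds)
    ]
    return float(np.mean(overlaps))


def make_toy_data(seed=0):
    """Synthetic data (hypothetical): 40 samples, 200 genes, genes 0-4 informative."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(40, 200))
    y = np.repeat([0, 1], 20)
    X[y == 1, :5] += 4.0  # shift the informative genes in class 1
    return X, y


X, y = make_toy_data()
selected = select_top_k(X, y, k=5)
stability = jaccard_stability(X, y, k=5)
```

Subsampling within each class keeps both classes represented in every round, which matters when the number of samples is small, as is typical for microarray data.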
* Hamid Mahmoodian
H_mahmoodian@pel.iaun.ac.ir

Armaghan Davoudi
Armaghan1989davoudi@gmail.com

1 Electrical Engineering Faculty, Najafabad Branch, Islamic Azad University, Najafabad, Iran
2 Digital Processing and Machine Vision Research Center, Najafabad Branch, Islamic Azad University, Najafabad, Iran

Medical & Biological Engineering & Computing
https://doi.org/10.1007/s11517-020-02160-6

In addition to the above categories, self-representation methods are referred to as methods that are based on a linear