International Journal of Engineering Trends and Applications (IJETA) – Volume 3 Issue 2, Mar-Apr 2016
ISSN: 2393-9516 www.ijetajournal.org Page 37
RESEARCH ARTICLE OPEN ACCESS

Association Rule Mining for Gene Expression Data - A Neural Network Approach
Pallabi Das, Rafiqul Islam, Nirupam Saha, Sourish Mitra, Sayani Chandra
Department of Computer Science and Engineering, GNIT, Kolkata, India

ABSTRACT
A systematic approach for learning and extracting rule-based knowledge from gene expression data has become an important research area. Using computational techniques such as data mining to find association relationships among gene data is challenging. The aim of this paper is to identify the set of genes responsible for the expression of another particular set of genes. As we are working with biological datasets, the approach depends solely on the type of data. To evaluate the dependencies among healthy or diseased genes, we have selected association rule mining, a technique generally used in market basket analysis. After generating the rules, we compare the support and confidence of each rule. Instead of the traditional conditional probability approach, a neural network is used to decide which rule is strongly expressed. Depending on the type of dataset, we have plotted a graph to show that, by tuning the activation function of the neural network, we can obtain the fittest rule that meets the minimum support and minimum confidence thresholds.

Keywords:- Association rule, neural network, support, confidence, activation function.

I. INTRODUCTION
The 20th century is frequently referred to as the Century of Biology, given the huge developments in this scientific area, which closed the century with the great success of the Human Genome Project [1,2], producing the full human DNA sequence. It is widely believed that thousands of genes and their products (i.e.
RNA and proteins) in a given living organism function in a complicated and orchestrated way. However, classical methods in molecular biology generally worked on a ‘one gene in one experiment’ basis, which implies very limited throughput, so the overall picture of gene function is hard to obtain. Global gene expression profiling, both at the transcript level and at the protein level, can be a valuable tool for understanding genes, biological networks, and cellular states. As larger and larger gene expression data sets become available, data mining techniques can be applied to identify patterns of interest in the data. Association rules, used widely in market basket analysis, can be applied to the analysis of expression data as well. Association rules can reveal biologically relevant associations between different genes or between environmental effects and gene expression. An association rule has the form Antecedent ⇒ Consequent, where Antecedent and Consequent are disjoint sets of items, and the Consequent set is likely to occur whenever the Antecedent set occurs. Items in gene expression data can include genes that are highly expressed or repressed, as well as relevant facts describing the cellular environment of the genes. Using the association rule mining approach, we can analyze the following cases:
1. The expression of one gene leads to the induction of a series of target gene expressions. This expression pattern is referred to as regulation of gene expression. The relationship between one gene and the other target genes can be viewed as an associative relation.
2. Several gene expressions lead to the expression of one target gene. Transcription factors and their target genes are one of many examples in this category (Morishita, 1999).
3. Gene expression leads to the induction of a new biological function (Nakaya et al., 2000).
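To make the rule form Antecedent ⇒ Consequent concrete, the following minimal sketch computes the support and confidence of a rule using the standard definitions from association rule mining: support is the fraction of experiments containing an itemset, and confidence is the support of the combined itemset divided by the support of the antecedent. The item labels (`gene1_up`, `gene3_down`), the toy `experiments` list, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper's dataset or implementation.

```python
from typing import FrozenSet, Iterable, Sequence


def support(itemset: Iterable[str], transactions: Sequence[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    antecedent: Iterable[str],
    consequent: Iterable[str],
    transactions: Sequence[FrozenSet[str]],
) -> float:
    """support(antecedent U consequent) / support(antecedent)."""
    a, c = frozenset(antecedent), frozenset(consequent)
    if a & c:
        # The rule form requires disjoint antecedent and consequent sets.
        raise ValueError("antecedent and consequent must be disjoint")
    support_a = support(a, transactions)
    if support_a == 0.0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(a | c, transactions) / support_a


# Hypothetical toy data: each experiment lists the genes observed as
# highly expressed ("_up") or repressed ("_down").
experiments = [
    frozenset({"gene1_up", "gene2_up", "gene3_down"}),
    frozenset({"gene1_up", "gene2_up"}),
    frozenset({"gene1_up", "gene3_down"}),
    frozenset({"gene2_up", "gene3_down"}),
]

# Rule under test: {gene1_up} => {gene2_up}
rule_support = support({"gene1_up", "gene2_up"}, experiments)
rule_confidence = confidence({"gene1_up"}, {"gene2_up"}, experiments)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In this toy example the rule {gene1_up} ⇒ {gene2_up} holds in two of four experiments, while its antecedent occurs in three, so a rule of this kind would be retained only if both values exceed the chosen minimum thresholds.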