157 International Journal of Intelligent Engineering and Systems, Vol.9, No.4, 2016 DOI: 10.22266/ijies2016.1231.17

FDSMO: Frequent DNA Sequence Mining Using FBSB and Optimization

Kuruva Lakshmanna 1*, Neelu Khare 2
1 VIT University, Vellore, Tamil Nadu, India
2 VIT University, Vellore, Tamil Nadu, India
* neelu.khare@vit.ac.in

Abstract: DNA sequence mining helps in discovering frequently occurring patterns and structures of DNA in DNA data sets. Frequent pattern mining is a central strategy for association rule discovery, but existing algorithms suffer from low efficiency or a poor error rate because biological sequences differ from general sequences and have more attributes. In our previous work, we proposed Prefix Span with Group Search Optimization (PSGSO) to optimize the results mined by the Prefix Span method. In this paper, we propose a new method called Frequent DNA Sequence Mining using Optimization (FDSMO), which combines Frequent Biological Sequence based on Bitmap (FBSB) with a Hybrid of Firefly and Group Search Optimization (HFGSO). The FDSMO process includes three stages: (i) applying the Frequent Biological Sequence based on Bitmap method, (ii) calculating the length, width and regular expression, and (iii) optimization using HFGSO. The experimental results demonstrate that FDSMO performs better than the existing methods in terms of both running time and scalability.

Keywords: FBSB, DNA sequence, mining, HFGSO, FDSMO, bitmap, Biological sequence

1. Introduction

In today's digital world, where an enormous amount of data is available in digital form, much of that data contains both significant and insignificant patterns. The main challenge is to discover interesting patterns that are useful for decision making, which is a very tedious and time-consuming task. Hence there is a need for an automated technique that does this job efficiently and effectively.
Sequential pattern mining is very useful for this purpose [1]. Various researchers have proposed different frequent pattern mining methods. These methods are categorized into biological sequence mining, condition-based pattern mining, closed pattern mining, and so on [2]. Sequential pattern mining plays a crucial role in a wide range of applications, including the analysis of web access patterns, customer purchasing patterns and DNA sequences [3], and the estimation of diseases [4]. In addition, sequential pattern mining [5, 6] is one of the fundamental topics in data mining and is a further aspect of association rule mining [7, 8]. The sequential pattern mining algorithm [9] addresses the problem of finding the frequent sequences, that is, the sequences that occur many times, in a given database [10]. In association rule mining, the mined result consists of the items that are frequently purchased together in a single transaction [11]. For DNA, clustering is widely used in genome databases. Although a few methods have already been proposed to cluster genome alignments and DNA microarrays [12], there is very little research on using DNA computation for clustering. A few approaches have been put forward that use DNA computation to solve clustering problems [13]. In addition, the past few years have seen individual and combined efforts of data mining and soft computing in the domain of bioinformatics [14]. In DNA sequence mining, soft computing techniques (including neural networks, fuzzy sets, genetic algorithms, soft sets and rough sets) are among the most commonly used. There are various general classification models, such as the Naive Bayesian Network [15], [16], [17], Neural Networks, Decision