Improved Text Mining for Bulk Data Using Deep Learning Approach

Indumathi A
PG Scholar, Department of Computer Science and Engineering, Sri Ramakrishna Engineering College, Coimbatore.

Perumal P
Professor, Department of Computer Science and Engineering, Sri Ramakrishna Engineering College, Coimbatore.

Abstract- Text document clustering and similarity detection are a major part of document management, where every document should be identified by its key terms and domain knowledge. Based on their similarity, documents are grouped into clusters. Several approaches to document similarity calculation have been proposed, but existing systems are either term based or pattern based, and they suffer from several problems. To address this challenge, the proposed system presents a model for document similarity that applies the back propagation time stamp (BPTT) algorithm. It discovers patterns in text documents as higher-level features and creates a network for fast grouping. It also detects the most appropriate patterns based on their weights, and BPTT computes the document similarity measures. Using this approach, documents can be categorized easily. A new approach is used to perform these steps, which helps reduce problems in the training process. The framework is named BPTT. BPTT has been implemented and evaluated on the .NET platform with different datasets.

1. INTRODUCTION

Data storage capacity has grown enormously as computer hardware technology has developed. As the amount of data increases exponentially, the information required by users becomes more varied. Users deal with textual data more than with numerical data, yet data mining techniques are much harder to apply to textual data than to numerical data. Text mining [1] is the task of finding interesting regularities in large textual datasets.
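As an illustration of the term-based similarity that the abstract contrasts with the proposed model, the following is a minimal sketch and not the authors' BPTT method. It assumes TF-IDF term weighting with cosine similarity, a common term-based baseline; the function names `tokenize`, `tfidf_vectors`, and `cosine`, and the sample documents, are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep purely alphabetic whitespace-separated tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document (assumed TF-IDF scheme)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Relative term frequency times a smoothed inverse document frequency.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, for illustration only.
docs = [
    "text mining finds patterns in text",
    "text mining groups similar documents",
    "stock prices fell sharply today",
]
vectors = tfidf_vectors(docs)
sim_related = cosine(vectors[0], vectors[1])    # documents share "text" and "mining"
sim_unrelated = cosine(vectors[0], vectors[2])  # documents share no terms
```

In this sketch, documents that share weighted terms score higher than documents with no terms in common, which is the basis on which a term-based system would group them into clusters. It also shows the limitation of purely term-based matching: two documents on the same topic that use different vocabulary would score zero.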
Text mining studies have recently gained importance because of the increasing number of documents available from a variety of sources, which include unstructured and semi-structured information. The main functions [2] of text mining include text summarization, text categorization and text clustering. The scope of this paper is restricted to text categorization. "Text mining" is increasingly being used to denote all the tasks that, by analyzing large quantities of text and detecting usage patterns, try to extract probably useful (although only probably correct) information.

Fig.1.1 Document classification process

Deep learning approaches [3] are representation learning methods with multiple levels of representation, composed of nonlinear modules that each transform the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level. With the composition of enough such transformations, very complex functions can be learned. In a deep learning approach, feature extraction can improve the accuracy of the learning algorithm and shorten the time it requires. Selecting the parts of a document that reflect information relevant to text classification, and calculating their weights, is called text feature extraction.

2. RELATED WORK

In recent years, the progress of web and social network technologies has led to massive interest in the classification of text documents containing links or other meta-information, and many researchers have studied classification algorithms. In this section we review these works and highlight their focus points. As we will see, the novelty of our work lies in studying almost all the modifications and improvements to each algorithm. The work in [4] focused on specific changes that are applicable to text classification.
The text classification algorithms considered were decision trees, pattern (rule) based classifiers, SVM classifiers, neural network classifiers, Bayesian (generative) classifiers, nearest neighbor classifiers, and genetic algorithm based classifiers. The authors discussed these methods and described how they are used for text classification. The work in [5] covered the process of text classification as well as the classifiers, and tried to compare some existing classifiers on the basis of a few criteria such as time complexity, principle, and performance. The theory and methods of text classification and text mining, the important

International Journal of Computer Science and Information Security (IJCSIS), Vol. 16, No. 4, April 2018 251 https://sites.google.com/site/ijcsis/ ISSN 1947-5500