ISSN: 2320-5407
Int. J. Adv. Res. 5(4), 174-181
Journal Homepage: www.journalijar.com
Article DOI: 10.21474/IJAR01/3793
DOI URL: http://dx.doi.org/10.21474/IJAR01/3793

RESEARCH ARTICLE

CLASSIFICATION OF WEB DOCUMENTS USING HYBRID FEATURE SELECTION.

V. David Martin 1 and Dr. T. N. Ravi 2
1. Research Scholar, Periyar E.V.R. College (Autonomous), Trichy.
2. Assistant Professor, Periyar E.V.R. College (Autonomous), Trichy.

Manuscript Info
Manuscript History
Received: 01 February 2017
Final Accepted: 10 March 2017
Published: April 2017

Key words:- Particle Swarm Optimization, Relative Reduct

Abstract
Knowledge discovery and data mining is the process of retrieving meaningful knowledge from raw data using different techniques. Text mining is the sub-domain of knowledge discovery that works on text data. Web mining is one class of data mining: a variation that distills the untapped, abundantly available free textual information on the web. The need for and importance of web mining grow along with the massive volumes of data generated on the web every day. Feature selection is an effective technique for dimension reduction and an essential step in successful data mining applications. It is a research area of great practical significance, and it has evolved to answer the challenges posed by data of increasingly high dimensionality. In this paper, a hybrid feature selection method is proposed: Relative Reduct and Particle Swarm Optimization are hybridized to reduce the size of the feature space.

Copy Right, IJAR, 2017. All rights reserved.

Introduction:-
Text mining research is advancing to the next level because of the growing number of electronic documents from a variety of sources.
The sources of unstructured and semi-structured data include the World Wide Web, governmental electronic repositories, news editorials, genetic directories, blog repositories, online forums, digital libraries, electronic mail and chat rooms. Consequently, appropriate categorization of these sources, and knowledge discovery from them, play a major role as a field of investigation. Natural Language Processing (NLP), data mining, and machine learning methods work together to categorize electronic documents and to discover patterns in them automatically. The primary objective of text mining is to enable users to extract information from textual resources, and it deals with operations such as retrieval, categorization (supervised, unsupervised and semi-supervised) and summarization. The open question is how these documents can be aptly interpreted, represented and classified. This involves numerous challenges, such as proper annotation of the documents, appropriate document representation, and dimensionality reduction to handle algorithmic issues [1]. Moreover, a suitable classifier must be chosen to achieve good generalization and avoid over-fitting. Extraction, integration and categorization of electronic data from miscellaneous sources, and knowledge discovery from these documents, are therefore of interest to research communities.

Corresponding Author:- V. David Martin.
Address:- Research Scholar, Periyar E.V.R. College (Autonomous), Trichy.

At present, the web is the chief source of text documents, the quantity of textual data available is constantly growing, and approximately 80% of an organization's data is stored in unstructured textual format [2] like