International Journal of Emerging Technology and Advanced Engineering
Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 3, Issue 2, February 2013)

A Novel Uncertain Fuzzy C-Means Clustering Technique Using Genetic Algorithm (UFCM-GA)

Sandhya Rawat¹, Ajit Kumar Shrivastava², Amit Saxena³
¹ Department of C.S.E., Truba Engineering College, Truba, Bhopal (M.P.), India
² Academic Dean, Truba Engineering College, Truba, Bhopal (M.P.), India
³ Departmental Head of C.S.E., Truba Engineering College, Truba, Bhopal (M.P.), India

Abstract-- In computer science, uncertain data is data that carries some specific uncertainty. Such data is typically found in sensor networks. When it is represented in a database, some indication of the probability of the various values must accompany it. There is a growing awareness that database systems need to handle and correctly process data with uncertainty, which is normally modeled by probability density functions. Beyond storing and processing such data in a DBMS, it is necessary to perform other data analysis tasks on it, such as data mining. Fuzzy clustering is a class of cluster analysis algorithms in which the allocation of data points to clusters is not "hard" (all-or-nothing) but "fuzzy", in the same sense as fuzzy logic. A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution and is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution. In this paper we propose Uncertain Fuzzy C-Means Clustering using Genetic Algorithm (UFCM-GA). The proposed mechanism is applicable to any uncertainty region. Analysis of the experimental results shows that it is effective compared with existing work.

Keywords-- Uncertain Data Mining, Data Uncertainty, UFCM, Genetic Algorithm.

I. INTRODUCTION

In the last decades, the amount of data collected in information and database systems has increased tremendously. To analyze this enormous amount of data, the interdisciplinary field of Knowledge Discovery in Databases (KDD) has emerged. KDD combines disciplines such as database systems [1-3]. To identify individual objects, similarity search on feature vectors is applicable, but even adaptable distance measures are not enough to handle objects that each have their own level of exactness.

Uncertainty in object descriptions also appears in many other advanced database systems, such as moving-object and sensor database systems, because no exact values are available to describe the data objects. Instead, the feature values are considered uncertain, and this uncertainty is modeled by probability distributions rather than exact feature values. A typical application of such an uncertainty model is moving objects, where the exact position of each object can be determined only at discrete time intervals. Queries often involve the position of objects between two time stamps or after the last known time stamp. The positions are then essentially uncertain unless the pattern of movement is very simple. The same problem exists in sensor networks, where continuously changing values such as temperature or wind speed can be measured only at discrete time intervals [4-6].

In this paper we propose Uncertain Fuzzy C-Means Clustering using Genetic Algorithm.

II. BACKGROUND TECHNIQUES

Data mining requires a modern technological infrastructure. Today, data mining applications are available on systems of all sizes, for mainframe, client/server, and PC platforms. System prices range from several thousand dollars for the smallest applications up to $1 million per terabyte for the largest.
Enterprise-wide applications generally range in size from 10 gigabytes to over 11 terabytes. NCR has the capacity to deliver applications exceeding 100 terabytes. There are two critical technological drivers:

Size of the database: the more data being processed and maintained, the more powerful the system required.
Query complexity: the more complex the queries and the greater the number of queries being processed, the more powerful the system required.
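
As a point of reference for the fuzzy allocation described in the abstract, the following is a minimal Python sketch of the standard fuzzy c-means (FCM) membership and center updates. It is the textbook formulation, not the UFCM-GA procedure proposed in this paper. The function names fcm_memberships and fcm_centers, the fuzzifier m = 2, and the sample points are assumptions made purely for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j], the degree to which points[i] belongs to centers[j].

    Standard FCM rule: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)),
    where d_ij is the Euclidean distance from point i to center j.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers:
            # share full membership among those centers only.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        memberships.append(
            [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        )
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Recompute each center as the mean of all points weighted by u_ij ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            )
        )
    return centers


# Hypothetical one-dimensional layout: three points, two initial centers.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
memberships = fcm_memberships(points, centers)
new_centers = fcm_centers(points, memberships)
```

Each row of memberships sums to 1, so a point lying between two centers belongs partly to both clusters rather than entirely to one. Alternating the two updates until the centers stop moving gives the usual FCM iteration.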