Copyright © 2018 Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

International Journal of Engineering & Technology, 7 (4.36) (2018) 147-153
Website: www.sciencepubco.com/index.php/IJET
Research paper

A Survey on Clustering Density Based Data Stream algorithms

Mayas Aljibawi*, Mohd Zakree Ahmed Nazri, Zalinda Othman
Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor Darul Ehsan, Malaysia
*Corresponding author E-mail: mayasaljibawi@gmail.com

Abstract

With the rapid evolution of technology, data size has increased as well. This opens the door to new challenges in finding patterns, such as limited memory and time and the need to process the whole data in a single pass. Many clustering techniques have been developed to overcome these issues. Streaming data evolve with time, which makes it almost impossible to define the number of clusters in that data in advance. Density-based algorithms are a significant class of data clustering methods for addressing this issue, because they do not require advance knowledge of the number of clusters. This paper reviews some of the existing density-based clustering algorithms for data streams, together with the measures used to evaluate them.

Keywords: data mining, clustering, density-based clustering, grid-based clustering, micro-clustering, stream data clustering.

1. Introduction

The rapid development of technology has made the volume of data collected from various sources very large. For example, the genome of a single human being can take up to 4 gigabytes of storage [1], and the amount of data created every day reaches up to 2.5 quintillion bytes [2]. Further large volumes of data are generated continually as streams by different applications.
Stream data mining, which refers to extracting knowledge structures from a stream, is attracting many researchers because of the growth in data stream generation and the importance of its applications [3]. Traditional approaches to data analysis are no longer suitable for the massive amounts of new data. New approaches are therefore needed to extract the important information from that data, with robust techniques for examining and explaining the data to obtain the relevant knowledge that assists in decision making.

2. Data mining and data clustering

2.1 Data mining

Data mining is the method of extracting previously unidentified relevant patterns, such as unusual records (anomaly detection), clusters (cluster analysis), and dependencies [4, 5]. Several definitions of data mining from the literature are given below:

[6] defines data mining as the approach of finding essential connections and patterns by moving through the data stored in a repository.
[4] describes it as the process of processing voluminous data stored in a database, seeking patterns and associations within that data.
[7] gives another definition of data mining as the process of selecting, exploring, and modeling huge amounts of data to discover previously unknown patterns for a business advantage.

2.2 Data clustering

Clustering is among the most suitable techniques for distributing data into groups of similar objects that are closely related to one another and different from the objects of other groups. Clustering approaches arrange a set of patterns into groups or clusters on the basis of similarity measures. Clustering techniques are based on an unsupervised approach, where data items are unlabeled and must be grouped into valid clusters [4, 5], while in supervised approaches the dataset is given in the form of a pre-classified item set. If the dataset is already labeled, the existing labels help in assigning a label to a new item.
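To illustrate grouping unlabeled items on the basis of a similarity measure, the following is a minimal Python sketch. It is not one of the algorithms surveyed in this paper: the function name `threshold_clusters`, the parameter `max_dist`, and the sample points are all hypothetical and chosen only for illustration. The sketch assumes Euclidean distance as the similarity measure and places two points in the same cluster whenever a chain of points, each within `max_dist` of the next, connects them. Note that the number of clusters is not supplied in advance.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def threshold_clusters(points, max_dist):
    """Assign a cluster label to every point without using any class labels.

    Hypothetical helper: points that are linked by a chain of neighbours,
    each at most max_dist apart, receive the same label.
    """
    labels = [None] * len(points)
    next_label = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already reached from an earlier point
        labels[i] = next_label
        stack = [i]
        while stack:
            j = stack.pop()
            for k in range(len(points)):
                # Grow the current cluster with every unlabeled close neighbour.
                if labels[k] is None and euclidean(points[j], points[k]) <= max_dist:
                    labels[k] = next_label
                    stack.append(k)
        next_label += 1
    return labels


# Assumed sample data: two visually separate groups of 2-D points.
points = [(0.0, 0.0), (0.5, 0.2), (0.3, 0.6), (5.0, 5.0), (5.4, 4.8)]
labels = threshold_clusters(points, max_dist=1.0)
print(labels)
```

With these sample points and `max_dist=1.0`, the first three points end up in one group and the last two in another. The choice of threshold drives the result: a very small value leaves every point in its own cluster, while a very large one merges everything.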
Figure 1: Data mining steps

• Clustering: the process in which the data points are partitioned into smaller groups. Each of the formed groups represents a cluster whose objects are similar to each other and dissimilar to the objects of other clusters. The result of this process is referred to as a clustering [3].
• Requirements for cluster analysis
➢ Scalability: many algorithms in the literature can handle small datasets, while databases nowadays consist of millions of objects, which makes high scalability a must for a clustering algorithm.
➢ Handling different types of attributes: algorithms are normally developed to deal with one type of data (numeric, binary, nominal, etc.). However, many applications now require clustering algorithms for complex types of data.
➢ Discovering clusters with different shapes: clustering algorithms usually measure distance with either the Euclidean or the Manhattan metric, which tends to produce spherical clusters of similar size and density. However, the shape of the clusters could vary (e.g.