Review Article
International Journal of Current Engineering and Technology, Vol.4, No.1 (Feb 2014)
E-ISSN 2277 – 4106, P-ISSN 2347 - 5161
©2014 INPRESSCO®, All Rights Reserved
Available at http://inpressco.com/category/ijcet

Review: Approaches for Handling DataStream

Purva S. Gogte Ȧ* and Deepti P. Theng Ȧ
Ȧ Department of Computer Science and Engineering, G. H. Raisoni College of Engineering, Nagpur, Maharashtra, India-441110

Accepted 10 January 2014, Available online 01 February 2014, Vol.4, No.1 (February 2014)

Abstract

Today, the widespread use of technology generates huge volumes of data, known as data streams. Data streams are continuous and unbounded, usually arrive at high speed, and change over time. They raise several issues, such as memory, time, noise, and dynamic data. Data streams need dedicated handling because of their changing nature, and a data stream may be labelled or unlabelled. Classification is supervised, so it can handle only labelled data. Thus, there is a need for a Hybrid Ensemble Classifier in which clustering and classification are brought together so that both labelled and unlabelled data streams can be handled. This paper describes different approaches for handling data streams.

Keywords: Data Streams, Clustering, Classification

1. Introduction

In recent years, many sources of streaming data have been developed. Tens of applications and millions of users access the World Wide Web daily. Moreover, advances in hardware devices, like wireless sensors and mobile devices, have led to an increase in the applications that generate streaming data (Satpute Pravin C, 2012). A data stream is a sequence of data items that arrive continuously and at high speed, in real time, implicitly or explicitly ordered by timestamps, and that are evolving and uncertain in nature. Data stream mining has recently emerged as a growing field of multidisciplinary research.
It combines various research areas such as databases, machine learning, artificial intelligence, statistics, automated scientific discovery, data visualization, decision science, and high performance computing. Data stream classification has thus been a widely studied research problem in recent years. The dynamic and evolving nature of data streams requires efficient and effective techniques that are significantly different from static data classification techniques. In recent years, mining data streams in large real-time environments has become a challenging job due to the wide range of applications that generate boundless streams of data, such as log records, mobile application sensors, emails, blogging, credit card fraud detection, medical imaging, intrusion detection, weather monitoring, stock trading, planetary remote sensing etc.

There are many issues in handling data streams, which are summarized as follows:
i) Large space: Data streams have enormous volumes of continuously incoming data.
ii) Dynamic data: Data streams are fast, changing, and uncertain, and require a fast response that incorporates changes in the data and reflects them in the output.
iii) Noise: Any approach applied to data streams should be able to deal with noise and outliers.
iv) Single scan: Since data streams carry an infinite volume of information that is fast and changing, stream data should be read only once.
v) Light weight: Techniques applied to vast data streams should process the stream using less time and memory while still providing an optimal output.

*Corresponding author: Purva S. Gogte

Data streams are essentially Big data. The term “Big data” is used for large data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process. Big data sizes are a constantly moving target, currently ranging from a few dozen terabytes to many petabytes of data in a single data set.
Typical examples of big data in the current scenario include web logs, RFID-generated data, sensor networks, satellite and geo-spatial data, social data from social networks, Internet text and documents, Internet search indexing, call detail records, astronomy, atmospheric science, genomics, biogeochemical data etc. Big data has emerged because we are living in a society which makes increasing use of data-intensive technologies. Big data poses many problems; for example, it is difficult to use relational databases with big data. The various challenges faced in large data management include scalability, unstructured data, accessibility, real-time analytics, fault tolerance and many more. In addition to variations in the amount of data stored in different sectors, the types of data generated and stored, i.e., whether the data encodes video, images, audio, or text/numeric information, also differ markedly from industry to industry.
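
The single-scan, light-weight, and noise requirements listed earlier in this section can be illustrated with a small sketch. The code below is not a method from this paper: it is a minimal Python example, under the assumption of a numeric stream, that reads each item exactly once, keeps only three numbers in memory (Welford's online mean and variance), and flags values that lie far from the running mean as possible noise. The names `RunningStats` and `flag_outliers`, the threshold `z`, and the `warmup` count are all hypothetical choices made for illustration.

```python
import math
from typing import Iterable, Iterator, Tuple


class RunningStats:
    """Running mean and standard deviation in O(1) memory (Welford's method)."""

    def __init__(self) -> None:
        self.n = 0        # number of items seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        """Fold one new item into the summary; the item itself is not stored."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        """Sample standard deviation of the items seen so far."""
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def flag_outliers(
    stream: Iterable[float], z: float = 3.0, warmup: int = 10
) -> Iterator[Tuple[float, bool]]:
    """Single pass over the stream, yielding (value, is_possible_noise).

    Assumption for this sketch: a value is treated as possible noise when it
    lies more than `z` standard deviations from the running mean, and no value
    is flagged until `warmup` items have been seen.
    """
    stats = RunningStats()
    for x in stream:
        is_outlier = (
            stats.n >= warmup
            and stats.std > 0.0
            and abs(x - stats.mean) > z * stats.std
        )
        stats.update(x)  # every item is read once and then discarded
        yield x, is_outlier


if __name__ == "__main__":
    readings = [10.0, 11.0] * 10 + [100.0]  # made-up sensor readings
    for value, is_outlier in flag_outliers(readings):
        if is_outlier:
            print("possible noise:", value)
```

Because the summary does not grow with the length of the stream, the same loop would work on an unbounded source. It does not address dynamic data, however: a summary over the entire history adapts slowly when the distribution changes, so a sliding window or a decay factor would be needed in that case.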