International Research Journal of Engineering and Technology (IRJET) | e-ISSN: 2395-0056 | p-ISSN: 2395-0072 | Volume: 02 Issue: 03 | June-2015 | www.irjet.net
© 2015, IRJET.NET - All Rights Reserved

AN INTRODUCTION TO MAP REDUCE APPROACH TO DISTRIBUTE WORK USING NEW SET OF TOOLS

Mr. Narahari Narasimhaiah 1, Dr. R. Praveen Sam 2
1 Research scholar, Bharathiar University, Tamilnadu, India.
2 Professor, Dept. of CSE, Andhra Pradesh, India.

---------------------------------------------------------------------------------------------------------------------------------------------------------

ABSTRACT - In the traditional relational database[3] world, all processing happens after the information has been loaded into the store, using a specialized query language on highly structured and optimized data structures. The Google approach, adopted by many web companies, instead creates a pipeline that reads and writes arbitrary file formats, with the computation spread across many machines and intermediate results passed between stages as files. Typically based on the Map-Reduce[1] approach to distributing work, this approach requires a whole new set of tools, which I'll describe below.

Key Words: Relational, Distributing, Map Reduce etc.

1. INTRODUCTION

Fig - 1: Mappers and reducers functionality.

Figure 1 shows the map and reduce combinators of the Map Reduce model, a concept drawn from functional programming languages such as Lisp. In Lisp, a map takes as input a function and a sequence of values, and applies the function to each value in the sequence. A reduce combines all the elements of a sequence using a binary operation. In 2004 Google introduced the Map Reduce framework to support distributed processing of large data sets over clusters of computers. It is currently an integral[3] part of the Hadoop ecosystem, and has since been implemented by many software platforms.
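The map and reduce combinators described above can be sketched in a few lines of Python. This is an illustrative sketch of the functional-programming concept only, not code from the paper:

```python
from functools import reduce

# map applies a function to each value in a sequence
squares = list(map(lambda x: x * x, [1, 2, 3, 4]))  # [1, 4, 9, 16]

# reduce combines all elements of a sequence using a binary operation
total = reduce(lambda a, b: a + b, squares)  # 1 + 4 + 9 + 16 = 30
```

The same two combinators, applied to data split across many machines rather than a single in-memory list, are the core of the Map Reduce model.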
Map Reduce was introduced and specifically designed to run on commodity hardware and to solve large-data computational problems. Based on divide-and-conquer[4] principles, the input data sets are split into independent chunks, which are processed by the mappers in parallel.

Fig - 2: Map Reduce Process.

The Map Reduce framework is responsible for the overall coordination of execution, based on the user-supplied code. This includes choosing appropriate machines (nodes) for running the mappers; choosing appropriate locations for the reducers' execution; starting and monitoring[2] the mappers' execution; sorting[3] and shuffling the mappers' output and delivering it to the reducer nodes; and starting and monitoring the reducers' execution.

2. FUNCTIONAL PROGRAMMING CONCEPTS

MapReduce programs are designed to compute large volumes of data in a parallel fashion. This requires dividing the workload across a large number of machines. If the components[1] were allowed to share data arbitrarily, this model would not scale to large clusters (hundreds or
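The split/map/shuffle/reduce flow coordinated by the framework can be illustrated with a minimal single-process word-count sketch in Python. The word-count task and all function names here are our own illustration, not the paper's code; in a real deployment the framework distributes each phase across nodes:

```python
from collections import defaultdict

def mapper(chunk):
    # map phase: emit (word, 1) pairs for each word in an input chunk
    for word in chunk.split():
        yield (word, 1)

def shuffle(pairs):
    # sort/shuffle phase: group all values by key, as the framework
    # does before delivering mapper output to the reducer nodes
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(word, counts):
    # reduce phase: combine all values for a key with a binary operation
    return (word, sum(counts))

# the input data set, split into independent chunks for the mappers
chunks = ["the quick brown fox", "the lazy dog"]
pairs = [p for chunk in chunks for p in mapper(chunk)]
result = dict(reducer(w, c) for w, c in shuffle(pairs).items())
# result["the"] == 2
```

Because each chunk is processed independently and values are only combined per key after the shuffle, the mappers never share state, which is exactly what allows the model to scale out.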