IJIRST - International Journal for Innovative Research in Science & Technology | Volume 1 | Issue 10 | March 2015 | ISSN (online): 2349-6010

Efficient HADOOP Frameworks SQOOP and Ambari for Big Data Processing

Mr. S. S. Aravinth, Assistant Professor, Department of Computer Science and Engineering, Knowledge Institute of Technology, Salem
Ms. A. Haseenah Begam, Department of Computer Science and Engineering, Knowledge Institute of Technology, Salem
Ms. S. Shanmugapriyaa, Department of Computer Science and Engineering, Knowledge Institute of Technology, Salem
Ms. S. Sowmya, Department of Computer Science and Engineering, Knowledge Institute of Technology, Salem
Mr. E. Arun, Department of Computer Science and Engineering, Knowledge Institute of Technology, Salem

Abstract

Processes that operate on very large volumes of data strain conventional systems. Because of this rapid growth of data, industries are struggling to store, handle, and analyse it, and normal database systems are no longer sufficient for these activities. Hadoop technology addresses this problem: it stores and processes enormous amounts of data effectively and efficiently, and it provides frameworks for data integration, management, orchestration, monitoring, data serialization, data intelligence, storage, and access. Among these is Sqoop, a command-line interface application for transferring data between relational databases and Hadoop; it supports both import into and export out of Hadoop. Another tool, Ambari, simplifies the management of Hadoop and the processing of huge amounts of data, and supports provisioning, managing, and monitoring Apache Hadoop clusters. In this paper the Sqoop and Ambari frameworks are analysed against various parameters.
Keywords: Big Data, Hadoop, Ambari, Sqoop, Data Processing
_______________________________________________________________________________________________________

I. INTRODUCTION TO SQOOP

Apache Sqoop is a tool designed for the efficient bulk import and export of data between Apache Hadoop and structured datastores. Sqoop is currently a top-level Apache project, implemented as a command-line interface application written in Java.

II. PURPOSE OF SQOOP

1) Sqoop is designed to transfer huge amounts of data efficiently between Apache Hadoop and structured data stores such as relational databases.
2) It copies data quickly from external systems into Hadoop.
3) It enables data imports from external data stores and enterprise data warehouses into Hadoop.
4) It ensures fast performance by parallelizing the data transfer and making optimal use of system resources.
5) Sqoop supports efficient analysis of the transferred data.
6) It even mitigates excessive load on external systems.

III. WORKING OF SQOOP

1) Sqoop runs in a Hadoop cluster and has access to the Hadoop core. Sqoop uses mappers to slice the incoming data.
2) Sqoop communicates with the database store to fetch information called metadata (table and column definitions) from the relational datastore. Sqoop uses this metadata to generate a Java class.
3) Sqoop gets the metadata from the database store.
4) Sqoop internally creates a Java class using the JDBC API, compiles the class using the JDK, and packages it into a jar file.
5) Once the jar file is created, Sqoop communicates with the database store again to determine the split column, which enables Sqoop to fetch data from the database in parallel.
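The import and export paths described above can be sketched with typical Sqoop invocations. This is an illustrative example only: the JDBC connection string, username, table names, split column, and HDFS paths are hypothetical placeholders, not values from this paper.

```shell
# Import: copy the rows of a relational table into HDFS.
# Sqoop generates and compiles a Java record class from the table's
# metadata, then launches parallel mappers (here 4), each fetching
# one slice of the table bounded by the split column.
sqoop import \
  --connect jdbc:mysql://dbserver.example.com/salesdb \
  --username sqoopuser -P \
  --table customers \
  --split-by customer_id \
  --num-mappers 4 \
  --target-dir /user/hadoop/customers

# Export: push data stored in HDFS back into a relational table.
sqoop export \
  --connect jdbc:mysql://dbserver.example.com/salesdb \
  --username sqoopuser -P \
  --table customer_summary \
  --export-dir /user/hadoop/customer_summary
```

These commands assume a running Hadoop cluster with Sqoop installed and a reachable MySQL server; `-P` prompts for the password interactively rather than placing it on the command line.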
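The split-column mechanism in step 5 can be illustrated with a short sketch: conceptually, Sqoop queries the minimum and maximum of the split column, divides that range into one interval per mapper, and each mapper issues a bounded query for its interval. The function below is a simplified illustration of that idea, not Sqoop's actual implementation.

```python
def split_ranges(min_val: int, max_val: int, num_mappers: int):
    """Divide [min_val, max_val] into num_mappers contiguous ranges,
    mimicking how a split column is partitioned across parallel mappers."""
    span = max_val - min_val + 1
    size = span // num_mappers
    remainder = span % num_mappers  # spread any leftover rows over the first mappers
    ranges = []
    lo = min_val
    for i in range(num_mappers):
        hi = lo + size - 1 + (1 if i < remainder else 0)
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges

# Each tuple would become a bounded query such as:
#   SELECT * FROM customers WHERE customer_id BETWEEN lo AND hi
print(split_ranges(1, 100, 4))  # → [(1, 25), (26, 50), (51, 75), (76, 100)]
```

Because each mapper's interval is disjoint and the intervals together cover the full range, every row is fetched exactly once, which is what makes the parallel transfer both fast and correct.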