IJSER © 2017 http://www.ijser.org

Challenges and Security Issues in Implementation of Hadoop Technology in Current Digital Era

Dr. Vinay Kumar, Ms. Arpana Chaturvedi

Abstract: With the advance of technology, managing the enormous and exponentially growing volume of data has become a major concern, particularly with respect to storing and organizing data securely. The exponential growth of data driven by the Internet of Things (IoT) has created many challenges for governmental and non-governmental organizations (NGOs). Security threats have forced private and public organizations to develop their own Hadoop-based cloud storage architectures. Apache Hadoop creates clusters of machines and efficiently coordinates work among them. The Hadoop Distributed File System (HDFS) and Map Reduce are two important components of Hadoop. HDFS is the primary storage system used by Hadoop applications. It enables reliable and extremely rapid computations and provides high availability of data to user applications running at the client end. Map Reduce is a software framework for analyzing and transforming very large data sets into the desired output. This paper reviews the HDFS 1.0, HDFS 2.0 and HDFS 2.8 architectures and their various functionalities, including analytical and security features.

Index Terms: Cloud Computing, Clusters, Hadoop, HDFS, Hive, IoT, Map Reduce, Pig, Sqoop.

1 INTRODUCTION

Hadoop is an open source architecture used to store structured, semi-structured, unstructured and quasi-structured data; collectively, such data is termed big data. It provides meaningful output using data analytics.
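As a rough illustration of the Map Reduce model mentioned in the abstract, the following sketch counts words in plain Python. It does not use the Hadoop API; the function names (map_phase, shuffle, reduce_phase, run_job) are hypothetical and chosen only for this example, and everything runs in a single process rather than on a cluster.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values collected for one key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence (hypothetical driver)."""
    mapped = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

In a real cluster the map and reduce steps would run in parallel on different nodes; here they are ordinary function calls so that the data flow is easy to follow.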
The standard process used to work with big data is ETL (Extract, Transform and Load). Extract means getting data from multiple sources, Transform means converting it to fit analytical needs, and Load means getting it into the right systems to derive meaningful value from it. This process provides various benefits to governmental as well as non-governmental organizations. The collected data is of two types: operational data and analytical data. The data falls under the following categories: transactional data, generated from daily transactions; social data, generated from social networking sites such as Facebook, Google ads etc.; and sensor or machine data, generated by industrial equipment, sensors installed in machines, black boxes in the aviation industry, web logs that track user behavior, medical devices, smart meters, road cameras, satellites, games and many other Internet of Things devices.

Government organizations are increasingly being digitized and Aadhaar-enabled. Aadhaar-enabled applications will provide better services and facilities to the right individuals and let citizens participate in the digital economy. To implement digitization and realize these benefits, companies are moving from their existing technologies to Hadoop. Hadoop is a highly scalable platform developed in Java. It includes a distributed file system that allows multiple concurrent jobs to run on multiple servers, splitting and transferring data and files between different nodes. It can process or recover stored data without delay if any node fails. At the same time, the risk of fraud increases when information is processed or stored in HDFS. Because big data raises distinct issues of management, storage, processing and security, each must be dealt with individually [8].
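The Extract, Transform and Load steps described earlier in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline of any particular system: the record fields ("source", "amount"), the in-memory lists standing in for data sources, and the dictionary standing in for the target system are all hypothetical.

```python
def extract(sources):
    """Extract: gather raw records from several sources into one list."""
    return [record for source in sources for record in source]


def transform(records):
    """Transform: normalise fields and drop records unusable for analysis."""
    cleaned = []
    for record in records:
        amount = record.get("amount")
        if amount is None:
            continue  # skip records with no value to analyse
        cleaned.append({
            "source": record["source"].strip().lower(),
            "amount": float(amount),
        })
    return cleaned


def load(records, store):
    """Load: add each record's amount to a per-source total in the store.

    The store is a plain dict standing in for an analytical system.
    """
    for record in records:
        key = record["source"]
        store[key] = store.get(key, 0.0) + record["amount"]
    return store
```

For example, calling load(transform(extract([transactional, sensor])), {}) with two small lists of records would produce one total per source name. In practice each step would read from and write to external systems rather than in-memory structures.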
This paper is organized into five sections. Section 2 deals with the literature review. The Hadoop file system, its architecture and its components are discussed in Section 3. Existing problems and challenges are outlined in Section 4, and the paper concludes with the proposed solution in Section 5.

Vinay Kumar is a Professor at Vivekananda Institute of Professional Studies, Delhi. Earlier he worked as a Scientist at NIC, MoCIT, Government of India. He completed his Ph.D. in Computer Science at the University of Delhi and his MCA at Jawaharlal Nehru University, Delhi. He is a member of CSI and ACM. Ph: 011-2734 3402. E-mail: vinay5861@gmail.com

Arpana Chaturvedi is an Assistant Professor at Jagannath International Management School, Delhi. She holds an M.Sc. (Math), MCA and M.Phil. (Comp. Sc.) and is pursuing a Ph.D. at Jagannath University. Ph: 01149219191. E-mail: ac240871@gmail.com

International Journal of Scientific & Engineering Research, Volume 8, Issue 4, April-2017, ISSN 2229-5518, p. 984